Posts Tagged with “apache”

@ApacheParquet Graduating and Mesos with Siri

The last week has been fun for me in open source land, aside from getting two of my wisdom teeth pulled out of my face. On the bright side, I have some painkillers now, and two notable things happened. First, it was nice to finally graduate Parquet out of the Apache Incubator:

It’s been a little over two years since we (Twitter) announced the open source columnar storage project with Cloudera. It’s a great feeling to see a plan come together and see this project grow over the years with 60+ contributors while hitting the notable achievement of graduating out of the Apache Incubator gauntlet. If there’s any lesson here for me, it’s that it is much easier to build an open source community when you do it in an independent fashion with at least someone else in the beginning (thanks Cloudera).

Another notable thing that happened was that Apple finally announced that they are using Mesos to power Siri’s massive infrastructure.

In my experience of building open source communities, there are usually your public adopters and private adopters. There are companies that wish to remain private about the software they use at times and that’s fine, it’s understandable when it can be viewed as a competitive advantage. The challenge is how you work with these private adopters when they use an open source project of yours while wanting to collaborate behind the scenes.

Anyways, it’s a great feeling to see Apple opening up a bit about their infrastructure and open source usage after working with them for a while. Hopefully this is a sign of things to come from them. Also, it would be nice if Apple just updated Siri so that when you ask what Mesos is, it replies with a funny response and proclaims her love of open source infrastructure technology.

Overall, it’s been a great last week.

Apache and Politics Over Code?

Mikeal Rogers just wrote a fascinating blog post, Apache considered harmful.

I have a lot of respect for the Apache community but I’m glad that someone is calling them out finally. The Apache community likes to pride itself on community over code but what has been happening recently regarding the move to a distributed version control system is either pure politicking or negligence in my opinion.

You would have to be living under a rock not to have noticed the change that distributed version control, and Github in particular, has brought to the open source world. Can you name any other major open source project (besides Apache) that is not on some form of distributed version control or lacks a concrete plan to move? I can’t, at least off the top of my head. This is because the times have changed: open source projects are more mainstream now and they especially favor distributed forges like Github.

Let’s try to have some fun with statistics. From a recent presentation by Stephen O’Grady from Redmonk, Github’s growth is almost unbelievable…

I’m confident if he updated the excellent presentation again, it would further show the distance between Github and the other forges. Heck, even throw in Bitbucket (Hg and Git now) and Launchpad (Bzr) to see how fast they are growing compared to the others. Another statistic we can look at to further spot this trend is package statistics from Debian…

That’s impressive growth for Git but still shows that SVN is doing OK (poor darcs). It would be great to see more download statistics but I can’t think of other easy sources at the moment. We can also analyze search volume via Google Trends to see what people are searching for over time…

Clearly git (including github) and mercurial are trending upwards. One could argue that this is because git and mercurial are harder to learn so people are searching more for them, but I doubt that’s the complete story. I didn’t include cvs (famous U.S. pharmacy) or bazaar (ambiguous) because they are searched for in other contexts and I don’t know how to tweak Google Trends to account for that. While doing these searches I wanted to test another hypothesis of mine. From personal experience, I believe that in the corporate world, distributed version control adoption is lagging. The main reason for this line of thinking is that corporations are generally slower than open source communities in adopting new technologies. To test this theory, I used Indeed to perform a search and see how things are going…

From the looks of it, CVS/SVN are still the dominant players with Clearcase hilariously staying somewhat constant over time. However, I’m sure this graph is going to look quite different in a couple of years as the tools around distributed version control systems mature. I also believe developers will start asking for a form of distributed version control after experiencing it in the wild (see git-svn). I was curious to see if LinkedIn had anything to help shed some more light on what is going on in the software industry and found their LinkedIn Skills application. I couldn’t find a good way to group and compare relative skills but I found some interesting information. In terms of relative growth, git seems to be trending well…

In terms of skill size, svn is still doing well.

I was curious to see how CVS was doing also…

CVS is experiencing negative skill growth. Then I noticed CMVC in the trends, which reminded me of bad times, and I knew it was time to stop digging for statistics.

Why do I care? Two main reasons. The first is simple and deals with my day job of facilitating open source efforts at Twitter. If you’re going to open source a new project, the fact that you simply have to use SVN at Apache is a huge deterrent from even going that route. It would be easier to simply host the code at Github or a similar forge and take what lessons you need from The Apache Way. There are a lot of tools available to help you with the infrastructure of your project (e.g., you can use Cloudbees or Travis CI to help you with continuous integration). The point here is that continuing to use SVN is not going to help Apache grow. When is the last time you heard a developer all excited about using SVN?

Another reason is that I have personal experience with this particular issue as I spent the last couple of years helping the Eclipse Foundation transition towards git. It’s a large transition because there are roughly 1000 committers and over 200 projects using a mix of CVS and SVN. On top of that, it took convincing the EGit/JGit projects to move to Eclipse and a couple of board meetings and votes to make that happen. Furthermore, the git tooling had to get up to snuff before the majority of projects started to adopt git since the previous generation of SCM tooling (e.g., CVS) spoiled Eclipse developers. All I’m saying is that it took a lot of work to start the transition and the Eclipse community hasn’t even fully completed it yet. Just ask the PostgreSQL community how quick it was moving to Git. The key point here is that you have to start the transition soon as it’s going to take a while to implement the move (especially since Apache hosts a lot of projects).

In the end, I’m a huge fan of the Apache Foundation and The Apache Way, as a lot of us have benefited and learned from Apache in some fashion. I just hope the Apache community learns to evolve or they will become less relevant in the new open source world order of distributed version control systems and the forges behind them. I take this problem to heart because I believe The Eclipse Foundation faces some of the same issues and we’re doing our best to mitigate them.

Openfire switches to the Apache License 2.0

I was having my morning caffeine fix and saw a tweet go by…

OpenFire goes Apache 2.0

I quickly went to the front page and noticed it hasn’t been updated yet:

However, this is interesting news to say the least. When established open source projects backed by some company switch licenses, there’s usually a business model change afoot. Or was there pressure from Google to switch the license since Google Wave uses Openfire under the covers?

Thankfully, Matt Tucker’s explanation hints at some of the reasons:

I’m happy to announce that Openfire will be moving to a more liberal open source license — Apache 2.0. Apache 2.0 provides significantly more flexiblity than the GPL in virtually every way, so it should be a big win for the community all around. We expect to get all the source code headers updated for the next release. There were several motivations for making this change:

  1. The GPL license was preventing some companies from using Openfire due to corporate policies
  2. There was no reason to keep using GPL and end-users generally seem to prefer Apache
  3. We’d like to encourage a broader range of commercial companies to contribute to the project and the Apache license is a good way to help make that happen

Would be happy to answer question or comments.

It was delightful to see the Openfire community be notified of the license change, with reasons why and a request for feedback. Chalk this one up as the proper way of doing a license change in an open source community. In the open source world, we hold transparency sacred. Don’t be like the project that switched from EPL to GPL and told no one until it was too late.

It’s important to be upfront and transparent.