Saturday, June 27, 2009

Social software and media, the virtues of agile development, and the new Iranian revolution


This is a repost of a blog entry created for the OOPSLA 2009 official blog.

The Web is now social. Increasingly, people young and old are spending a substantial part of their social lives on the Web. Social networking sites like Facebook, MySpace, YouTube, Twitter, and many others are giving people the ability to maintain various social interactions and connections with friends, family, and acquaintances. Indeed, one often hears of people using Facebook to reconnect with old friends and make new ones. Twitter has quickly transformed into one of the best means for sharing real-time information on various topics all across the Web. YouTube’s videos can transform the fortunes of unknown talented individuals from the most remote parts of the world, as well as shed light on issues and facts that otherwise would have been completely ignored.


As the world bears witness to the recent unrest in Iran, the power of social media has never been clearer or more manifest. Twitter, Facebook, and YouTube are giving a voice, a face, and a communication channel to the people on the streets of Tehran. All this comes despite the efforts of the Iranian regime to shut down media and reporters across the country. While it is uncertain how this new revolution in Iran will end, and as the world continues to watch intently, there is one undeniable truth: the unintended impact of social media. As Clay Shirky puts it, social media has enabled the democratization and amplification of the voices of the people...

But what does social media or social software have to do with OOPSLA? Surely they are software systems, but what else does the conference bring to help this new wave of Web software?
It is true that social software is about connecting and empowering people and thus appears to be a purely social phenomenon that simply is enabled and constructed using basic Web software principles and approaches... There is, however, an important and subtle aspect that is hard to see on the surface and that has its roots in OOPSLA. In a nutshell, it is about agility... Looking deeper at the aforementioned social media sites, one other thing becomes amply clear. Many of the uses and social consequences of these sites were mostly, if not completely, unintended.

Most of the sites started with initial ideas of connecting people but ended up with emerging usage patterns that are truly powerful and consequential. The creators of Twitter did not set out to create the new voice for modern democratic revolutions; that use emerged accidentally. So, the question becomes: how does one create software that can have the kind of profound social impact and success that Twitter has had? There is no “cookie cutter” solution. However, one well-known approach (one that was used at Twitter) is to use software frameworks and principles that are in line with the principles of agile software development.

At a recent talk at IBM’s Almaden Research Center, Twitter’s CEO Jack Dorsey was asked by Almaden Ph.D. intern Ajith Ranabahu whether, in light of the scaling issues that Twitter has recently experienced, he and the other original Twitter founders would have chosen another language or another framework to create their company. While Dorsey acknowledged some of the limitations of the Ruby on Rails platform they are using, he was quick to say that he would still have made the same framework choice given another chance... Dorsey's reasoning is simply that the key factor is not one of scaling and architecture, but rather of agility and speed of development.

Rails is well known for providing both development speed and agility in bundles. By being able to materialize their ideas quickly, Dorsey and the other Twitter co-founders created a working version of their site in a few weeks which let them observe the initial users and also let them continuously iterate and find the subsequent micro and now macro successes. With each group of users, new patterns emerged and Dorsey and team could quickly iterate, adjust, and malleably modify their software to match the emerging usages and patterns. Without the agile virtues of Ruby on Rails, it is doubtful Twitter would be the phenomenon that it has become today.

As perhaps the preeminent incubator of agile software development thinking during the last decade and a half, and the place where agile has gone mainstream, the OOPSLA conference, it seems, has been a key enabler for the Agile movement that has inspired frameworks like Rails and, indirectly, sites such as Twitter. The agile practices of test-driven development, pair programming, continuous integration, and the Scrum team organization approach have roots either directly or indirectly at OOPSLA, or have progenitors who frequently attend and participate at the conference. And if we go even further, the virtues of rapid, instant prototyping and having “the customer as the driver”, as Kent Beck likes to say, may now be taking their natural course in social media and crowdsourcing of content.

And while social software is undoubtedly a phenomenal success, there remain some serious challenges, and this is also where the OOPSLA community could help further. First, there are the programming challenges. The amplification attributes of Twitter and Facebook occur because the Web is now programmable. Using APIs and simple scripts, it is easy to create aggregation points, as well as new data sinks and sources for information flowing through the social media channels. This is how a tweet from the streets of Tehran can flow, be retweeted, and end up, almost instantaneously, on the television screens of millions in the United States and Europe. The challenge is making it quick and easy for anyone to collect, filter, and aggregate social media information.
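
To make the "collect, filter, and aggregate" idea a little more concrete, here is a minimal sketch in Ruby. It assumes the messages have already been collected; the Message struct, the sample data, and the helper names are all hypothetical, and no real social media API is called.

```ruby
# Hypothetical record for one collected message. A real script would
# fill these from a social media API, which is not shown here.
Message = Struct.new(:author, :text)

# Filter step: keep only messages whose text mentions the hashtag
# (compared case-insensitively).
def filter_by_hashtag(messages, hashtag)
  needle = hashtag.downcase
  messages.select { |message| message.text.downcase.include?(needle) }
end

# Aggregate step: count how many messages each author contributed.
def count_by_author(messages)
  counts = Hash.new(0)
  messages.each { |message| counts[message.author] += 1 }
  counts
end

# Made-up sample data standing in for the "collect" step.
sample = [
  Message.new("alice", "Crowds gathering downtown #example"),
  Message.new("bob",   "Lunch was great"),
  Message.new("alice", "Streets are calm now #Example"),
  Message.new("carol", "RT: Crowds gathering downtown #example")
]

puts count_by_author(filter_by_hashtag(sample, "#example")).inspect
```

Even at this toy scale the hard parts are visible: the filter is a crude substring match, and nothing here says whether a retweet should count as a separate voice.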

Second, and perhaps most importantly, are the challenges around the provenance of the information that is flowing through the social media channels. Now that everyone has a voice, it becomes increasingly difficult to discern authentic voices from those trying to manipulate the system. Here, research in data provenance, data mining, and data filtering for the massive amount of realtime and streaming data is key. Realtime and stream programming pose significant and fundamental challenges that beg for systems, frameworks, and programming language help.

Finally, as in all computing for open systems (such as the Web), the concerns of privacy and security remain paramount. It is now well accepted that these persistent issues cannot be addressed after the fact, but must be dealt with at the early stages of development. There is a clear need to share best practices and uncover patterns to help overcome these challenges...

So while social media and social software are helping transform the fabric of social interactions from the hills of Silicon Valley to the bars of Austin, to the cafés of Paris, and to the streets of Tehran, remember that many of these social consequences were not planned, but rather emerged from the resulting empowering software that is itself possible due to the virtues of agile software development and practices... And together with the community that produced and helped agile practices go mainstream, we can help address some of the important remaining social media challenges so that the new voice of the people can persist, remain strong, and authentic.


References
Go here to watch Dorsey's talk at Almaden and Ajith's question toward the end.

Updates
1. Initial post on 06/27/2009

Tuesday, June 9, 2009

Cloud computing - a programming perspective


This is a repost of a blog entry created for the OOPSLA 2009 official blog.

Cloud computing is the hot new topic. Simply put, various business pressures, a multitude of pain points, and the maturity of a series of Web technologies (networking, APIs, and standards) have made it possible and cost-effective for businesses, small and large, to host data centers and application centers completely virtually... in the cloud, if you will.

Cloud computing providers, e.g., Amazon, reuse their expertise in efficiently managing and hosting their own Web systems and applications, and expose that core expertise as a set of Web APIs. Using the Amazon Web Services Elastic Compute Cloud (EC2), anyone with a credit card and some programming skill can provision a server instance, install a Web application on it, and thus immediately have a presence on the Web. Using economies of scale for server hardware combined with virtual machine technologies, expertise in automating data centers and application centers, and extensive instrumentation, Amazon is able to provide that service globally for pennies an hour. There are no binding contractual agreements, and Amazon will only bill you for the hours you have used.

In addition to compute instances, Amazon also provides various other compute resources on its cloud platform, e.g., storage (file and block), message queues, batch data processing, and others. Following Amazon’s lead, various companies, including Google, IBM, and Microsoft, are also exposing frameworks, services, platforms, and applications to a worldwide audience from within a Web browser and with simple Web APIs. Cloud computing is no less than a democratization of compute resources. With cloud computing, vast compute resources no longer require huge and long-term investments but instead can be had and consumed, as Amazon chairman Jeff Bezos likes to say, “by the drink”.

Whether cloud computing will fulfill the high expectations that many are advocating is still to be determined. Various challenges remain and, in our opinion, we are reaching the peak of the typical hype curve that new technologies follow. However, regardless of whether cloud computing will be a bust or continue to be the hit that it has certainly been so far, there is one undeniable truth that some seem to ignore... The current success of cloud computing and, we believe, its future successes are heavily tied to how easy the cloud and cloud applications are to program, as well as to maintain and to scale. And this is precisely why OOPSLA matters to cloud computing advocates, users, and providers, and vice versa.

As we mentioned, with the cloud, computing resources are cheap and widely available. In a matter of minutes, one can provision 100s of server instances on the Amazon EC2 cloud along with terabytes of storage and more aggregate MIPS than what is available on most recent mini-computers. All of this for around $10 an hour. While most anyone could afford such computing capacity at these price points, what is hard for most is to take advantage of that cheap capacity. The problem is no longer one of provisioning the resources, but rather one of taking advantage of these resources and of efficiently doing so.
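
As a back-of-the-envelope check on that figure, here is a small sketch in Ruby. The rate of 10 cents per instance-hour is an assumption chosen for illustration, not a quoted price, and the helper names are hypothetical; real pricing varies by provider, instance type, and region.

```ruby
# Assumed, illustrative rate: 10 US cents per instance-hour.
# This is not a quoted price; substitute the provider's actual rate.
ASSUMED_RATE_CENTS_PER_HOUR = 10

# Cost in dollars of running a number of instances for one hour.
def hourly_cost_dollars(instances, rate_cents = ASSUMED_RATE_CENTS_PER_HOUR)
  (instances * rate_cents) / 100.0
end

# Cost in dollars of running those instances for a number of hours.
def total_cost_dollars(instances, hours, rate_cents = ASSUMED_RATE_CENTS_PER_HOUR)
  hourly_cost_dollars(instances, rate_cents) * hours
end

puts hourly_cost_dollars(100)
puts total_cost_dollars(100, 24)
```

Under that assumed rate, 100 instances come to 10 dollars an hour. Storage and bandwidth charges are left out of this sketch.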

We are at the beginning of a new evolution of programming, one that is taking place with this move to cloud computing. For lack of a better moniker, we call it cloud programming. It is about being able to scale programs to take advantage of these on-demand cloud compute resources. Programming distributed nodes of computation has always been one of the classic ongoing problems of computer science. The cloud, it seems, has thrust this problem and its corollary issues to the forefront...

While cloud programming has some resemblance to old-style distributed programming, supercomputing, or multicore programming, it is a different problem due to the changes in the core assumptions and constraints. On the cloud, most compute resources are essentially server instances with virtual compute capacities or virtualized services. The network is the Internet, and assumptions about co-location, latency, and errors cannot be made. The same concerns one has with real servers in a data center also persist. That is, securing, upgrading, automating, and managing these virtual instances are still very much part of the programming that one must do to reap the benefits of new cloud infrastructures. Scripting languages, e.g., Ruby, Python, and Groovy, are already taking center stage to solve some of these issues.

Additionally, now that storage can shrink and grow on demand and for very low costs, while keeping reasonably good qualities of service, the other issue is how to manipulate the vast amount of data that one can now store. Google had a similar concern years ago as it improved its search engine while managing expenses in growing its data centers to match the unprecedented growth of the Web. Google engineers and scientists cleverly figured out how to parallelize data computation over large clusters of cheap and replaceable compute nodes. The MapReduce programming model is specifically designed to help engineer algorithms that can scale and run on the resulting big data that one now accumulates...
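
For readers who have not seen the model, here is a minimal single-process sketch of the MapReduce idea in Ruby, using the classic word-count example. It is an illustration only: the function names are hypothetical, everything runs sequentially in one process, and a real MapReduce system distributes the map and reduce phases across many nodes and handles failures, which this sketch does not attempt.

```ruby
# Map phase: turn one document into a list of [word, 1] pairs.
def map_words(document)
  document.downcase.scan(/[a-z']+/).map { |word| [word, 1] }
end

# Shuffle phase: group all emitted values by their key (the word).
def shuffle(pairs)
  grouped = Hash.new { |hash, key| hash[key] = [] }
  pairs.each { |key, value| grouped[key] << value }
  grouped
end

# Reduce phase: collapse each word's list of counts into one total.
def reduce_counts(grouped)
  grouped.each_with_object({}) do |(word, counts), totals|
    totals[word] = counts.reduce(0, :+)
  end
end

# Run the three phases in order over a collection of documents.
def word_count(documents)
  pairs = documents.flat_map { |document| map_words(document) }
  reduce_counts(shuffle(pairs))
end

puts word_count(["the cloud", "The Web and the cloud"]).inspect
```

The appeal of the model is that map_words needs only one document and the reduce step needs only one word's counts, so both phases can in principle be spread over many cheap nodes.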

Programming for massive scale is the key challenge. We firmly believe that new styles of programming, new programming frameworks, and new programming languages may be one of the key sources of innovation for the cloud. Imagine cloud frameworks and cloud programming environments that provide, in near real time, the cost, the energy impact, and the automation facilities that a cloud computing infrastructure enables. Now imagine also being able to program these multiple cloud nodes either in batch or in real time, while satisfying best practices of Web security and privacy. The combined results would be the nirvana of Web programming: automatically scaling your compute resources in a cost-efficient and environmentally friendly fashion while managing the resulting deluge of data and potential influx of users...

Surely there are many PhD theses to be had in addressing some of the fundamental scientific and engineering issues involved in achieving such an idealized state of Web computing. In some ways we may be vastly simplifying the issues, and many of the challenges involved have been studied in various branches of computer science and software engineering for the past 30 years. However, the point here is not to claim that cloud computing is the assured next wave of computing (we don’t know); rather, we would like this post to simply serve as a reminder that the various issues in system, data, and distributed computing that cloud computing brings to the forefront could be addressed by innovations in frameworks, programming styles, and programming languages... OOPSLA, it seems, with its long track record of groundbreaking innovations in this space, may be a natural choice for the genesis of some of these future eureka moments.


Updates
06/01/09 - fixed typos: accumulate => accumulates

Sunday, May 31, 2009

Why OOPSLA matters?


This is a repost of a blog entry I created for the OOPSLA 2009 official blog.

These days, we all take for granted that software is best built incrementally, that testing while coding leads to better quality software, that virtual machine-based languages can be as fast as natively compiled languages, that patterns are a great way to bootstrap your thinking when designing, and that an object-oriented language with single inheritance is likely easier to deal with than one with multiple inheritance...

Many of these well-accepted tenets in the software industry and programming trade have their roots in one conference, a conference that started with a band of early programmers who were passionate about a powerful new style of programming: object-oriented programming. That style has evolved over the years to become a source of innovation for all things programming and software. Indeed, most of the assertions above can be traced back to their origins in papers, workshops, or ideas stemming from that conference: OOPSLA. Such is the legacy of this conference.


Times and technologies change. As a result, every year OOPSLA has had to introspect and look for ways to rejuvenate itself and encourage exploring the boundaries of software. The inventor of Self, Dave Ungar, likes to state it simply: always "question your assumptions."

What are OOPSLA’s basic assumptions? Well, over the years it has been a conference about software languages, software development, software development methodologies, and software systems. Should this still be our focus? Software is embedded everywhere, and the success of devices like the iPhone and BlackBerry is a good indication that at least one immediate future of software is in mobile solutions that include a combination of hardware and software within an ecosystem (private or public).

The Web has also transformed our social lives and is increasingly a communication fabric unparalleled in scope, reach, and immediacy. Web services like Twitter and Facebook have transformed the Web into a real-time virtual social square. Information is flowing quickly and at ever-increasing volumes. This social software is not only near real-time and location-aware, but it is also interconnected with complex executable logic. Mashups of Web APIs and data have led to a boom of innovations analogous to the early days of commerce on the Web.

The current Web has not only resulted in the democratization of information and applications, but increasingly it is the gateway to reaching every business’s data centers and application centers. Using Web APIs, a startup can run its entire operation virtually on cloud computing infrastructure without concern for acquiring sufficient compute resources to scale should that startup become the next overnight success, that is, if it is TechCrunched, Dugg, or Slashdotted.

With so much happening around software and the Web, why should someone from academia or industry still attend an object-oriented conference?

This is an important question. It is one that cannot be completely answered in one blog post. However, I will give you a short answer now and elaborate on each point over the next three months in various blog posts and podcasts. I hope to convince you that OOPSLA matters. It matters to both academic and industrial participants. It matters because of its tutorials, its workshops, its keynotes, and all of its leading-edge content. Most of all, it matters because of the world-class people who regularly present their new ideas at OOPSLA. As you will see when we announce the program, all of the hot topics mentioned above will be represented in some fashion in this year’s program...


What makes any conference really worthwhile is the quality of the people who attend. OOPSLA has a tradition of attracting the best and most innovative students, professors, consultants, industry researchers, and practitioners. This year will be no different. Come to OOPSLA 2009 and you are sure to meet members of the Gang of Four, the instigators of the Agile movement, and the creators of the Web’s hottest languages and frameworks, as well as hear from researchers and practitioners at the leading universities and companies.

Yes, “the times they are a-changin'”. But just as Bob Dylan will forever have a certain “je ne sais quoi” that makes his music pertinent, classic, and always filled with relevant content and meaning, so too will the OOPSLA conference. As long as we keep welcoming a core group of innovators, keep including new topics in tutorials, workshops, and keynotes, and keep attracting the quality content that you will hear when we announce the program, the conference’s future is very much assured and alive.

Check back frequently for other posts as we peel away at this year’s program and demonstrate why OOPSLA matters to you.

Updates
06/01/09 - added link to OOPSLA 2009 blog entry

Thursday, May 14, 2009

Web 2.0 Security & Privacy 2009 @ Claremont Resort in Berkeley on May 21st


Repost from colleague Larry Koved of IBM Research. Tyrone Grandison, Kun Liu, Tony Sun, Sherry Guo, Dwayne Richardson, and I have a short paper at the workshop entitled "Privacy-as-a-Service: Models, Algorithms, and Results on the Facebook Platform". Find the PDF on the workshop's Web site. Join the Facebook page for attendees and news.



Reminder: One week until the workshop.

Web 2.0 Security & Privacy 2009
Claremont Resort in Oakland, California
May 21, 2009

http://w2spconf.com/2009/

The goal of this one day workshop is to bring together researchers and practitioners from academia and industry to focus on understanding Web 2.0 security and privacy issues, and establishing new collaborations in these areas. This workshop is the 3rd in a series of successful workshops on this topic.

Registration is now open. See the main conference web site for registration information: http://oakland09.cs.virginia.edu/ . (You may register and participate in the workshop even if you are not attending the 30th IEEE Symposium on Security & Privacy.)

If you would, please pass this information on to your colleagues who may be interested in this workshop.

Friday, February 20, 2009

WebGuild blog is flagged as malware!

WebGuild.org is an interesting Silicon Valley group and blog that pulls no punches when it comes to the latest news from Valley tech companies, including news that may not be flattering to them. The group organizes various conferences and seminars on hot tech topics such as Web 2.0, cloud computing, mashups, and modern Web user interfaces.





For instance, they were one of the first blogs to note that Google was quietly laying off 1000s of temp workers... The blog post had noticed discrepancies in Google's current 'workers figures' in their regular investor data. They also noted IBM's similarly quiet resource actions and the apparent use of age in those decisions, as well as Hewlett-Packard's recent decision to cut pay across the company.

A more pointed news item in the HP WebGuild post noted that the HP CEO's pay last year included about $79,814 of tax for food consumed! Working backward, this means about $243,000 in food expenditure. Wow, even as a non-HP investor, I find this news appalling.

So I am a bit surprised that the links in yesterday's WebGuild.org email summary of blog entries were all being flagged as malware.


What's causing this? Are WebGuild's opinions and analysis so uncomfortable to companies that the site is now being targeted? I don't know and would hate to spread rumors. However, part of what makes democracies work is checks and balances. WebGuild is part of that balance in the tech industry; agree or not with their tactics and news reporting, their voice should not be shut down, silenced, or controlled.

Wednesday, December 10, 2008

Ruby on Rails 10 Tips - Part 1: Development


I was recently asked by some colleagues for advice and best practices on developing, securing, and deploying Ruby on Rails applications. While I don't have full-fledged best practices, I do have lots of tips to share. These are not as thorough as best practices or as well structured as patterns, but are instead short, to-the-point pieces of common and not-so-common knowledge that I have found useful in my experience developing Rails applications.

Also, most of these tips come from my own experience, from discussions with others in the community at Meetup and RailsConf, and from reading various blogs and books on the subject. I apologize in advance if I have not linked to your blog where you have similar tips. Feel free to leave a comment and I will do so.

Finally, I love a great debate. If you disagree with any of these tips, then by all means post a comment and let me know what you disagree with and why. I know at least one colleague at IBM Research, Chet Murthy, who is experienced in Rails development and disagrees with me on at least one of these tips. I welcome your feedback, Chet.

Enjoy!

10 Rails Development tips

  1. Use ActiveResource and avoid ActionWebService. With ActiveResource and multiview support in Rails 2.x, you can easily expose RESTful models, Atom/RSS feeds on those models, JSON, and any other view format you could wish for.


  2. Consuming ActiveResource is easy: set self.site = "URL" on the client side. However, that does not add data to the DB, and every query will result in a REST call. Caching data is key.


  3. Use Rake to automate any other tasks you do. Custom Rake tasks are easily added in lib/tasks. Use rake -T to see the tasks currently available, including yours.


  4. Use database migrations to update models once your design is solid, you have a first release, and you can no longer afford data loss. That is, to be clear, use one migration per model early on; then, once the app is released for beta, make every change to the models via a new migration (not an update to an old migration). This will save lots of headaches in the future and allow you to easily move the application from one version to the next.


  5. Use Solr rather than Ferret for model searchability and the associated acts_as_xyz plugins. This is because Ferret indexes tend to get corrupted. That, in some sense, is a shame, since Ferret is a nice and easy plugin.


  6. Make sure to use Rails validations in your models. I would avoid special DB statements in migration code. This makes it easier to move to a different DB, e.g., from MySQL to DB2.


  7. Use view partials to keep your views DRY. Essentially, partials should hold any view code that is repeated, similar to a subroutine call (PullUp or PushDown method refactoring). Use an app/views/shared directory for partials that are used across controllers. Also, always use the controller (or shared) name when calling the partial. This will help you avoid view errors when refactoring views into shared.

    So instead of:

        <%= render :partial => 'funky_partial' %>

    do instead:

        <%= render :partial => 'my_controller/funky_partial' %>


    When not specified the controller is assumed to be the one from the calling view. If you reused funky_partial in a view coming from a controller other than my_controller, you will get a view partial file not found exception. Easily fixed but easily avoidable as well.


  8. Don't add new actions to ActiveResource models, other than the basic CRUD operations' actions. Instead, use a separate controller that groups actions related to one or many ActiveResources, e.g., don't add login/logout actions to the User model if that's an ActiveResource, and so on. This is because you will need to add special routing maps in your config/routes.rb, and that will be hard to maintain. ActiveResources are RESTful by design, and the actions added with scaffold are good enough. Use rake routes to see all your routes (more on this point later).


  9. Refactor common code into plugins after you notice it being used in many places and applications. This is especially true for moving common model code into acts_as_xyz. This is harder to do when you first start but becomes easier after a few apps and once you have done one. See this RailsOnWave tutorial for an introductory guide to creating your own acts_as_xyz plugins.


  10. Test, test, test. Whether you test first, test last, or somewhere in between, in many ways what matters is the fact that you are testing. Doing TDD means having a sustained development flow that always includes tests. If your code base is growing one-sided (that is, tests are not growing at the same rate as code), then something is wrong and you will pay for it in the future. Use rake stats to get a feel for your current code/test levels.
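
To illustrate the caching point in tip 2, here is a minimal sketch of a time-based cache in plain Ruby. It is my own illustration under stated assumptions, not part of Rails or ActiveResource: the TimedCache class, its fetch method, and the "users/1" key are hypothetical, and the block passed to fetch stands in for the REST call that an ActiveResource finder would make.

```ruby
# Hypothetical time-based cache; not part of Rails or ActiveResource.
class TimedCache
  # ttl_seconds: how long an entry stays fresh.
  # clock: callable returning the current time in seconds; it is
  # injectable so the expiry logic can be exercised without sleeping.
  def initialize(ttl_seconds, clock = -> { Time.now.to_f })
    @ttl = ttl_seconds
    @clock = clock
    @entries = {}
  end

  # Returns the cached value for key while it is fresh. Otherwise runs
  # the block (the expensive remote call), stores its result, and
  # returns it.
  def fetch(key)
    now = @clock.call
    entry = @entries[key]
    return entry[:value] if entry && (now - entry[:stored_at]) < @ttl

    value = yield
    @entries[key] = { value: value, stored_at: now }
    value
  end
end

# Usage: the block below stands in for a remote lookup.
remote_calls = 0
cache = TimedCache.new(60)
2.times do
  cache.fetch("users/1") do
    remote_calls += 1
    { "name" => "example" }
  end
end
puts remote_calls
```

In a real application you would more likely reach for the caching facilities Rails already provides; the sketch only shows why the second lookup does not need to hit the remote service.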


Next I'll list my top 10 list of tips for Rails application deployments and scaling.

Monday, August 11, 2008

OOPSLA 2008 tutorial program

I wrote this post originally for the OOPSLA 2008 blog. However, to help reach a broad audience, I am taking the liberty of replicating it here. After all, I'll be giving my Web APIs on Rails: using Ruby on Rails to create Web APIs and Services Mashups tutorial at OOPSLA 2008 on Wednesday, October 22nd. Sign up if you want a hands-on, fast-track, and fun overview of Ruby, Rails, Web APIs, and mashups.





While OOPSLA is well known for advancing the fields of computer science, software engineering, and their practical applications---after all, Java, Eclipse, Aspect-Oriented Programming, Design Patterns, Agile Methods, just to name a few, got their start at past OOPSLAs---another important aspect of the conference is the wide-range of tutorials available to novices and experts alike.

Continuing this tradition, OOPSLA 2008 boasts a tutorial program unlike any I have seen. There are more than 40 tutorials covering advanced, new, novel, and mainstream topics such as Agile, Test-Driven Development, Domain-Specific Languages/Modeling, Microsoft's F#, advanced C++, Google's Guice, SOA, Apple's iPhone SDK, and Ruby on Rails, to name a few.

Tutorials are offered daily throughout the conference and are often given by the creators of the technologies in question or by worldwide experts in the fields.

Don't miss learning some new skills, sharpening old ones, or getting a head start on tomorrow's next big thing. Register today for OOPSLA and add a tutorial or two to your OOPSLA conference experience.