It’s been a busy week of announcements from cloud platform vendors. Amazon announced RDS, their MySQL-based relational database service, lower pricing on their EC2 compute service, and new higher-memory, higher-capacity EC2 instances. I see RDS as a welcome addition and very complementary to Amazon’s SimpleDB service.
SimpleDB provides simplicity and (relatively) unlimited scalability, but that comes with some big compromises – the biggest being eventual consistency and no transactional integrity. Eventual consistency means data updates are not reflected immediately – they propagate over time (usually under 5 minutes), which can create some unique challenges for transactional applications. Without transactional integrity, you have no guarantee that a set of related updates is applied together, which creates the risk of data corruption.
RDS, on the other hand, provides all the advantages of a traditional relational database (MySQL, specifically), but at the cost of greater complexity and more limited scalability. Amazon does take away a significant amount of that complexity and some of the scaling pain with RDS. They provide all the generic database administration services, including backups. And they provide the ability to scale both CPU and storage capacity with simple API calls. But there is a limit to how high an RDS instance can scale, at which point you have to resort to manual horizontal scaling techniques like clustering and partitioning – which are not automatically supported by RDS. While both RDS and SimpleDB have limitations, used together they offer a very powerful and flexible solution.
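To make the transactional integrity point concrete, here is a minimal sketch of what "a set of related updates applied together" looks like in a relational database. It uses Python's built-in SQLite module purely as a stand-in for MySQL on RDS, and the `accounts` table and the `make_demo_db`, `transfer` and `balances` helpers are hypothetical names made up for this illustration.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


def transfer(conn, src, dst, amount):
    """Debit src and credit dst as one unit; return True if it committed."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back every statement in it if an exception is raised.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (remaining,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if remaining < 0:
                # The debit above has already run; raising here undoes it.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    db = make_demo_db()
    print(transfer(db, "alice", "bob", 30), balances(db))
    print(transfer(db, "alice", "bob", 500), balances(db))
```

In the failing transfer the debit has already been executed when the check fails, yet the rollback leaves both balances untouched. With a store that has no transactions, the application itself would have to detect and repair that half-applied state.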
Meanwhile, in an email to Windows Azure CTP (Community Technology Preview) participants, Microsoft announced plans to transition Windows Azure from a CTP to a commercial offering by February 1st, 2010.
- At PDC 2009, on November 17th, 2009, a number of new features in Windows Azure will be made available for the first time. The CTP will remain open through December 31st, allowing you to experiment with the full feature platform and to give us any final feedback.
- Beginning January, 2010, new customers will have to sign up for an offer to access services on the Windows Azure platform. You’ll receive your first bill with a $0 balance, so you can see your exact usage while still enjoying free service.
- On February 1, 2010, we will begin charging customers for using the Windows Azure platform.
I’ve been surprised how long Microsoft held off the official release of the commercial Azure platform, meanwhile losing market share to Amazon and others. I’ll be interested to see what is released in November and how their pricing compares to Amazon’s.
I’ve been looking into semantic web services to extract key terms and concepts from user-generated content. Calais and Zemanta both offer rich web services, designed to help you find and integrate relevant and related content from around the web. For my purposes, I’m just interested in the term/concept extraction – which is just a small part of what they provide. Yahoo! has a much more basic service designed to do just that, appropriately named the Yahoo Term Extraction Service.
I decided to do a quick evaluation/comparison, using the following text from one of my Delicious bookmarks:
Online Communities: Establishing a Community’s Culture – Online Community Report
We initiated the Online Community Culture study in October of 2008, as part of the ongoing research agenda of the Online Community Research Network. The intention of the study was to get a broad look at the factors that influence online community culture, and the steps community managers and strategists take in cultivating, and in some cases influencing, a community’s culture. We had over 75 participants in the research, representing many sectors, including software, tech, traditional media, social media and online community, and non-profits. Respondents seniority skewed towards Manager (44%), Directors & VP’s (12%).
The results from each were quite different. Calais and Zemanta both seem to have more “semantic intelligence” and were able to home in on the terms most relevant to the subject. Calais offered a short but entirely relevant list of terms – all extracted directly from the text. Zemanta offered a broader set of terms, including some related terms not explicitly in the text, such as “social network” and “community management”. Unfortunately, it also included some unhelpful terms, such as “computers” and “on the web”. Yahoo! provided the broadest list of terms, but also the least helpful. With all the resulting terms extracted directly from the text, Yahoo!’s service seems to be mostly a parser, with the least semantic analysis. However, Yahoo!’s simplicity can be valuable as well. With other examples, I’ve seen Calais and Zemanta come up empty (no terms), while Yahoo! provided some relevant and some not-so-relevant terms. As with people, too much intelligence can be problematic.
Unfortunately, none of the services consistently provides ideal terms. But combined, you might get decent results. That’s something I’m continuing to explore. For those interested, the resulting terms from each service are below.
Calais:
- Online Community Research Network
- social media
- online community culture
- online community
- Online Community Culture study
Zemanta:
- Virtual community
- Social media
- Online Communities
- Computers
- Non-profit organization
- On the Web
- Community Management
- Social network
Yahoo!:
- culture study
- community culture
- community managers
- research agenda
- ongoing research
- strategists
- seniority
- respondents
- vp
- intention
- sectors
- non profits
- participants
- community research network
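As a rough sketch of what "combined" could mean, one option is a simple vote: normalize each term and keep the ones returned by more than one service. The snippet below makes no calls to any of the three services; it just reuses the term lists above, and the `normalize` and `combine_terms` helpers and the `min_votes` threshold are my own assumptions rather than anything the services provide.

```python
from collections import Counter

# Term lists copied from the results above (no service calls are made).
RESULTS = {
    "calais": [
        "Online Community Research Network", "social media",
        "online community culture", "online community",
        "Online Community Culture study",
    ],
    "zemanta": [
        "Virtual community", "Social media", "Online Communities",
        "Computers", "Non-profit organization", "On the Web",
        "Community Management", "Social network",
    ],
    "yahoo": [
        "culture study", "community culture", "community managers",
        "research agenda", "ongoing research", "strategists", "seniority",
        "respondents", "vp", "intention", "sectors", "non profits",
        "participants", "community research network",
    ],
}


def normalize(term):
    """Lowercase, treat hyphens as spaces, and collapse whitespace."""
    return " ".join(term.lower().replace("-", " ").split())


def combine_terms(results, min_votes=2):
    """Return terms reported by at least min_votes services, most votes first."""
    votes = Counter()
    for terms in results.values():
        # A set ensures each service votes at most once per term.
        votes.update({normalize(t) for t in terms})
    agreed = [t for t, n in votes.items() if n >= min_votes]
    return sorted(agreed, key=lambda t: (-votes[t], t))


if __name__ == "__main__":
    print(combine_terms(RESULTS))
```

With exact matching, the only term two services agree on for this sample is "social media", which shows how little the three lists overlap. A more useful combination would probably need fuzzier matching (singular vs. plural, sub-phrases such as "community culture" inside "online community culture") or per-service weights.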

