Groups on Glassboard - ISV Case Study

Glassboard is a mobile service for sharing privately with groups. With Glassboard, you create ‘boards’ which are groups of people around a common interest where you can share messages, comments, photos, videos, and even your location (when appropriate).

When the Glassboard team was developing Glassboard, they looked for a platform that offered scale, worry-free operation, replicated data, and privacy for their apps on Windows Phone, iPhone, Android, and Office 365. The Glassboard team chose the Windows Azure platform for these capabilities.

This article provides an overview of the architecture of the Glassboard backend in Azure.

Architecture Overview

The architecture of Glassboard includes native smartphone and desktop applications, a Web Service to handle requests, a data store to hold the messages and videos, and a notification service to alert users of incoming messages.

Glassboard Architecture

Client Architecture

Each mobile platform has a native application. The Windows Phone 7 app is in the Windows Phone Marketplace, the Android & iPhone apps are available in their respective stores, and the Silverlight desktop client is available through the Office 365 Marketplace. Each of these clients makes calls directly to the custom Glassboard backend built in Azure.

Services Architecture

Each of the phone applications talks to Web Services hosted in Windows Azure. The connections are all made over SSL, and they use digest authentication. Authorization is provided by a custom-built service of user names and passwords. Once the user is authenticated through the Web Service, a token is returned to the device and attached to subsequent service requests.
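The token step can be illustrated with a minimal sketch. This is not Glassboard's implementation, which the article does not describe in detail. The token format, the HMAC signing scheme, the one-hour lifetime, and the names issue_token and validate_token are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

# Hypothetical server-side secret; a real service would load it from protected configuration.
SERVER_SECRET = secrets.token_bytes(32)
TOKEN_LIFETIME_SECONDS = 3600  # assumed lifetime, not a Glassboard value


def _sign(payload: str) -> str:
    """Return a hex HMAC-SHA256 signature of the payload."""
    return hmac.new(SERVER_SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(user_id: str, now: Optional[float] = None) -> str:
    """Called after a successful log in; returns 'user_id:expiry:signature'."""
    current = time.time() if now is None else now
    payload = f"{user_id}:{int(current + TOKEN_LIFETIME_SECONDS)}"
    return f"{payload}:{_sign(payload)}"


def validate_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Called on each later request; returns the user id or None if rejected."""
    current = time.time() if now is None else now
    try:
        user_id, expiry, signature = token.rsplit(":", 2)
    except ValueError:
        return None  # not in the expected three-part form
    if not hmac.compare_digest(signature, _sign(f"{user_id}:{expiry}")):
        return None  # forged or altered token
    if int(expiry) < current:
        return None  # token has expired
    return user_id
```

A client would send the token with every request after logging in, and the service would call validate_token before doing any work.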

WCF Services Web Role

The Web Service is written using REST on Windows Communication Foundation (WCF), which is part of .NET 4. The WCF services receive and validate the user log in, and receive the messages, pictures, and videos. The Web Service is hosted in a Windows Azure Web Role. You can think of a Web Role as a Windows Server virtual machine that includes Internet Information Services (IIS).

Data Store

Incoming messages from the apps are stored in Windows Azure Table Storage as entities.

The message is encrypted prior to writing the data into the table store. The Glassboard team used Azure Table Encryption by Attribute, which is available on CodePlex. A single attribute is used to transparently encrypt and decrypt data when saving or reading data from Azure Table storage. The technique ensures that “data at rest” is encrypted. This means no external party can read your content. Authorized infrastructure personnel at your company, the Glassboard team, and Microsoft staff are all unable to decrypt this data. Incoming pictures and videos are stored in Azure Blob Storage in the same way.

Encoding Videos Using Queues and Worker Role

Videos need to be encoded so they can be seen on each of the other devices. The Web Role receives the incoming media file and stores the file in Windows Azure Blob Storage. Then it places an item in Windows Azure Queue Storage to alert the Worker Role that encoding is needed. The Worker Role queries the queue, which acts as a “to do” list of tasks that can take several minutes.

Think of a Worker Role as a service that runs in a never-ending loop. When the Worker Role cycles, it checks for an item in the queue. The item in the queue points to the location in blob storage that needs to be encoded. The Worker Role picks up the item, marks the item in the queue as being in progress, and begins its work. If for some reason the encoding fails, the item in the queue is reset so the encoding can be restarted. When the encoding is completed, the Worker Role removes the item from the queue.

As the Worker Role completes encoding each video, it places the media into Windows Azure Blob Storage, from where it can be sent in chunks to the users.
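The pick up, mark in progress, retry or delete cycle described above can be sketched with a toy in-memory queue. This is a stand-in for illustration only and does not use the Windows Azure Queue API. The class InMemoryQueue, the function process_next, and the visibility timeout value are assumptions.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class QueueItem:
    item_id: int
    blob_path: str                 # where the unencoded video lives
    invisible_until: float = 0.0   # hidden from other workers until this time


class InMemoryQueue:
    """Toy stand-in for a cloud queue with a visibility timeout."""

    def __init__(self, visibility_timeout: float = 300.0) -> None:
        self.visibility_timeout = visibility_timeout
        self._items = {}
        self._ids = itertools.count(1)

    def put(self, blob_path: str) -> None:
        item_id = next(self._ids)
        self._items[item_id] = QueueItem(item_id, blob_path)

    def get(self, now: float) -> Optional[QueueItem]:
        """Return the next visible item and mark it as in progress."""
        for item in self._items.values():
            if item.invisible_until <= now:
                item.invisible_until = now + self.visibility_timeout
                return item
        return None

    def release(self, item: QueueItem, now: float) -> None:
        """Make a failed item visible again so it can be retried."""
        item.invisible_until = now

    def delete(self, item: QueueItem) -> None:
        self._items.pop(item.item_id, None)

    def __len__(self) -> int:
        return len(self._items)


def process_next(queue: InMemoryQueue, encode: Callable[[str], None], now: float) -> bool:
    """One cycle of the worker loop; returns False when there was nothing to do."""
    item = queue.get(now)
    if item is None:
        return False
    try:
        encode(item.blob_path)
    except Exception:
        queue.release(item, now)   # encoding failed: put the item back
        return True
    queue.delete(item)             # encoding succeeded: remove the item
    return True
```

A worker would call process_next in a loop and sleep briefly whenever it returns False. If a worker crashes mid-encode, the item becomes visible again once the timeout passes, so another worker can pick it up.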

Notifications

Notifications are also handled by the Worker Role. When the Worker Role detects that a new notification is to be sent, it passes the notification to the smartphone platform's notification service. On Windows Phone, the notification is sent using the Microsoft Push Notification Service.

For Windows Phone devices, the Glassboard app running on the phone initially requests a push notification URI from the Push client service. Through a negotiation with the Push service, it receives a URI that identifies the device. The client software sends that URI to the Glassboard Web Service.

Glassboard maintains a list of these URIs in Windows Azure Table Storage. When it comes time to notify each device, the service walks through the user network, gets the URI for each member’s device, and sends a push notification to the Microsoft Push Notification Service, which in turn routes the push notification to the application running on a Windows Phone device.

For Windows Phone, the notification is delivered as raw data to the Phone application, the application’s Tile is visually updated, or a toast notification is displayed. The Microsoft Push Notification Service sends a response code to your web service after a push notification is sent indicating that the notification has been received and will be delivered to the device at the next possible opportunity. However, the Microsoft Push Notification Service does not provide an end-to-end confirmation that your push notification was delivered from your web service to the device.

Other devices have a similar service.
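The fan-out step, walking the members of a board and pushing to each registered device URI, might look roughly like the sketch below. The function notify_board, the dictionary shape, and the send callable are hypothetical. In a real service, send would make an HTTP request to the platform's push service.

```python
from typing import Callable, Dict, Iterable, List


def notify_board(
    members: Iterable[str],
    device_uris: Dict[str, List[str]],
    send: Callable[[str, str], bool],
    message: str,
) -> Dict[str, int]:
    """Send `message` to every registered device of every board member.

    `device_uris` maps a member to the push URIs stored for that member's devices.
    `send(uri, message)` returns True when the push service accepted the request.
    """
    sent = 0
    failed = 0
    for member in members:
        for uri in device_uris.get(member, []):
            if send(uri, message):
                sent += 1
            else:
                failed += 1
    return {"sent": sent, "failed": failed}
```

An accepted request only means the push service has queued the notification. As noted above, it does not confirm delivery to the device.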

Additional Clients

Because the primary interface into the Glassboard service is a Web Service, Glassboard messages can be integrated into other applications. For example, you can access Glassboard messages from within NewsGator Social Sites. NewsGator integrates Glassboard into Social Sites so users can see micro-blogging messages and links from within an enterprise.

Windows Azure Advantages

With this architecture, Windows Azure provides the advantages of storage replication, scale, and service guarantees.

Storage

Windows Azure Blobs, Tables, and Queues are replicated three times in the same data center for resiliency against hardware failure. No matter which storage service you use, your data is replicated across different fault domains to increase availability and fault tolerance.

Windows Azure Blobs and Tables are also geo-replicated between two data centers hundreds of miles apart from each other on the same continent, to provide additional data durability in the case of a major disaster, at no additional cost.

Scale

The loosely coupled architecture provides the ability to scale. When many users connect, Glassboard can increase the number of Web Roles to process the incoming messages. When the number of videos and notifications outpaces the ability of the Worker Role to process them, additional Worker Roles can be enabled.

Because the architecture is loosely coupled, compute cycles can be added as needed.

Conversely, during late evenings when messages and videos are fewer, the system can scale back to a small number of compute instances ready to receive the next message.
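One simple way to express such a scaling decision is to size the worker pool from the queue backlog. The sketch below is an assumption for illustration, not how Glassboard or Windows Azure decides. The thresholds and the function name desired_worker_count are made up.

```python
import math


def desired_worker_count(
    queue_length: int,
    items_per_worker: int = 10,   # assumed backlog one worker can handle
    min_workers: int = 1,         # keep at least one instance ready
    max_workers: int = 20,        # assumed cost ceiling
) -> int:
    """Return how many worker instances the current backlog calls for."""
    needed = math.ceil(queue_length / items_per_worker)
    return max(min_workers, min(max_workers, needed))
```

With these defaults, an empty queue late in the evening keeps one instance running, and a backlog of 35 items calls for four.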

Using Windows Azure Queue allows decoupling of different parts of a cloud application, enabling cloud applications to be easily built with different technologies and easily scale with traffic needs.

Privacy, Security

Security on Windows Azure is a shared responsibility. Glassboard provides a high degree of security for the individual messages because it takes advantage of application best practices. Traffic between phones and the Web Service is encrypted using HTTPS, and confidential data is encrypted prior to storage.

Glassboard users can provide location information and photos without sharing those details with others. Even the Glassboard staff has no access.

Windows Azure operates in the Microsoft Global Foundation Services (GFS) infrastructure, which is ISO 27001-certified. ISO 27001 is recognized worldwide as one of the premier international information security management standards. Windows Azure is in the process of evaluating further industry certifications. In addition to the internationally recognized ISO 27001 standard, Microsoft Corporation is a signatory to Safe Harbor and is committed to fulfilling all of its obligations under the Safe Harbor Framework.

Get Glassboard

For iPhone, Android, and Windows Phone 7.

Additional Resources

About Web Roles and Worker Roles
About Security on Azure

Also see Building a Massively Scalable Platform for Consumer Devices on Windows Azure in MSDN Magazine.

Special thanks to Walker Fenton and Brian Reischl.

 

Bruce D. Kyle 
ISV Architect Evangelist | Microsoft Corporation


Mastering the Cloud Deployment

What does a cloud computing expert need to know? In part one of the cloud interview guide, we covered some basic Unix & Linux systems administration skills, and cloud computing and infrastructure concepts. Those are key starting points.

In this second part, let’s dig into deploying applications in the cloud, and day to day operations skills. There’s a lot of material here. We recommend picking a few questions out of the bunch and focusing on those questions, rather than trying to cover all of them.

1. Deploying in the Cloud

Deploying applications into virtual or cloud datacenters involves understanding and evaluating providers. Many just deploy on Amazon EC2 as it is far and away the largest cloud hosting solution, with the most robust offering.

o What sets Amazon apart from the other cloud providers?

There are probably two things that set Amazon apart from other cloud infrastructure solutions. EBS, or Elastic Block Store, is one. Although the others have storage solutions, and Rackspace is working on their own virtualized storage, Amazon seems to be the furthest ahead with their offering. It is fully virtual, allows arbitrary chunks of storage to be attached to instances, and allows instances to boot off EBS volumes.

The other major point is that since Amazon has grown so large, so quickly, it has more datacenters, in more geographically dispersed areas, than other providers. Since these are organized into logical resources and can be accessed through an API, your application infrastructure becomes truly virtual.

o What are some other large cloud providers?

Joyent, Rackspace cloud, Storm on Demand, GoGrid and VoxCloud. There are certainly many others. Take a look at this Quora post: Most Reliable Cloud Providers.

o Tell one vendor management story.

Everyone who has managed operations has worked with vendors at one point or another. For example, if you’ve worked with Rackspace, you know that it’s pretty easy to get a human on the line. Amazon, on the other hand, allows you to do everything yourself, and only later added a support service option. So their service pattern and history are different.

o How do you troubleshoot a problem?

There isn’t really a right or wrong answer to this question, but it’s a nice starting point for discussion. It can also help illustrate a candidate’s communication skills, and how specifically they walk through solving a problem. What problem they choose as an illustration, and how they work through to a resolution, is an important indicator of operations experience.

Pros and cons of Amazon versus Rackspace, configuration management & automation, and cloud management solutions like Scalr and RightScale… these and other skills are important for a cloud deployment expert.

o What are Puppet and Chef?

Puppet is a configuration management system which allows ops teams to build templates for servers, and deploy many servers based on those templates. It further allows centralized control of configuration, to automate the management of a large number of servers.

Chef grew out of frustrations with Puppet, and is a sort of next-generation configuration management system.

The term infrastructure as code may be thrown around. Since all cloud resources can be provisioned through API calls, everything in server deployment can be *theoretically* done via code, from spinup of servers, to installing packages, to configuring, code checkout, seeding databases and more.
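A minimal sketch of the idea: describe the servers you want as data, compare that with what is running, and derive the actions that close the gap. The function plan_changes and the dictionary shapes are hypothetical. This is not the Puppet or Chef API, and a real tool would go on to execute each action through the provider's API.

```python
from typing import Dict, List, Tuple


def plan_changes(desired: Dict[str, dict], actual: Dict[str, dict]) -> List[Tuple[str, str]]:
    """Return the actions needed to move `actual` infrastructure to `desired`.

    Both arguments map a server name to its specification, for example
    {"web-1": {"size": "small", "packages": ["nginx"]}}.
    """
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("destroy", name))
    return actions
```

Running the plan a second time after the actions have been applied yields an empty list, which is the property that makes this style of automation safe to repeat.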

o What are some of the pros and cons of configuration management for operations?

Pros include allowing a smaller team to automate the deployment of a large fleet of servers, standardization, and consistency. Cons include complexity when needing to do surgical, urgent changes, and complexity when coming into an existing environment that you’ve inherited.

o How is RightScale different? What does it provide?

RightScale is a layer on top of your cloud provider. It provides a common interface and dashboard from which to deploy servers. Templating, automation, and multi-cloud support make it a great solution for teams that have less technical expertise on staff or fewer hands to manage things.

o How about Scalr?

Scalr is another management solution that supports multiple cloud providers. It offers templating and auto-scaling too.

2. Day-to-day skills

o What type of programming experience do you have?

The answer is that every ops guy or girl should be able to code, just as every developer should have some basic operational experience. Should and does are often two different things, so ask for some examples.

o Shell scripts

Bash, csh, Perl and Python are all part of the Linux administrator’s toolbox. Writing backup scripts, log rotation, automating routine tasks and so forth are all common needs of an operations expert.

Regular expressions are a part of Unix and used in scripting to search files, cronjobs, and ETL jobs. Ask for some basic examples.
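As one basic example a candidate might give, the sketch below uses a regular expression to count server errors in web access logs. The pattern assumes lines in the common Apache/Nginx access log layout, and the function name count_server_errors is made up for illustration.

```python
import re
from typing import Iterable

# Matches lines such as:
# 203.0.113.7 - - [10/Oct/2012:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def count_server_errors(lines: Iterable[str]) -> int:
    """Count log lines whose HTTP status code is in the 5xx range."""
    count = 0
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status").startswith("5"):
            count += 1
    return count
```

Lines that do not match the pattern are skipped rather than treated as errors.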

o What is continuous integration?

The old model of code deployment was called waterfall, and allowed long careful planning, coding of new features, testing, and finally deployment. The cycle could take weeks or months, and iterative change took a lot of time. Continuous integration, also known as agile deployment, allows much more frequent deployment of changes, in some cases many times per day.

o What are metrics good for?

Just like in website visitor tracking and business analytics, server-level analytics and tracking are possible. Collecting server metrics such as load averages, memory, disk and CPU usage over time can be invaluable. When an application slows or a server stalls, checking historical metrics can often quickly reveal problems or causes.
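A minimal sketch of taking one metrics sample with the Python standard library follows. The function name sample_metrics and the dictionary keys are assumptions. A real setup would take such samples on a schedule and ship them to a monitoring system.

```python
import os
import shutil
import time


def sample_metrics(path: str = "/") -> dict:
    """Take one snapshot of load average and disk usage for this server."""
    usage = shutil.disk_usage(path)
    try:
        load_1m, load_5m, load_15m = os.getloadavg()
    except (AttributeError, OSError):
        # Load averages are not available on every platform.
        load_1m = load_5m = load_15m = None
    return {
        "timestamp": time.time(),
        "load_1m": load_1m,
        "load_5m": load_5m,
        "load_15m": load_15m,
        "disk_used_fraction": usage.used / usage.total if usage.total else 0.0,
    }
```

Appending each sample to a file or time-series store gives the history needed to look back when an application slows down.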

What are some examples? Nagios, Ganglia, Cacti, Munin, OpenNMS.

o What is unit testing?

This allows for software to be built in small testable components. When the components are coded, tests are also written that check whether they are operating properly, and whether dependencies are also installed and working.
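For instance, a small component and its tests could look like this, using Python's built-in unittest module. The function parse_port is a made-up example component.

```python
import unittest


def parse_port(value: str) -> int:
    """Convert a string to a TCP port number, rejecting invalid values."""
    port = int(value)  # raises ValueError for non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


class ParsePortTests(unittest.TestCase):
    def test_valid_port(self):
        self.assertEqual(parse_port("8080"), 8080)

    def test_out_of_range(self):
        with self.assertRaises(ValueError):
            parse_port("70000")

    def test_not_a_number(self):
        with self.assertRaises(ValueError):
            parse_port("http")
```

Saved to a file, the tests can be run with python -m unittest, and a continuous integration server can run the same command on every commit.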

Metrics, monitoring, load testing, firewalls, security & patching, SaaS, PaaS and IaaS: there is a wide swath of skills needed to be competent as a web operations engineer. You’ve got your work cut out for you!

o What is load testing?

By performing some benchmarks, load testing can make estimates about how the application and code will perform when more users are hitting it.
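A very small load test can be sketched with a thread pool that calls the same operation many times and records latencies. The function run_load_test is hypothetical, and request stands for whatever call you want to measure, such as an HTTP request to a staging server. The 95th percentile here is a simple nearest-rank approximation.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_load_test(request: Callable[[], None], total_requests: int = 100, concurrency: int = 10) -> dict:
    """Call `request` repeatedly from several threads and summarize latency."""

    def timed_call(_: int) -> float:
        start = time.perf_counter()
        request()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed_call, range(total_requests)))

    return {
        "requests": len(latencies),
        "mean_s": statistics.mean(latencies),
        "p95_s": latencies[int(0.95 * (len(latencies) - 1))],
        "max_s": latencies[-1],
    }
```

Repeating the run at increasing concurrency levels and watching how the mean and 95th percentile change gives a rough estimate of where the application starts to degrade.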

o Security & networking

Sometimes a systems administrator is a generalized admin and sometimes there is a networking specialist on staff who doesn’t allow anyone else to touch that domain.

o What are firewall rules?

Unix services use port numbers to expose those services to the world. Since all servers on the internet are identified by IP addresses, firewall rules are defined around IP addresses or groups of them, and the ports they’re allowed to access.
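The matching logic can be sketched with Python's ipaddress module. The Rule class and the first-match, default-deny behaviour are assumptions for illustration. Real firewalls such as iptables or cloud security groups have their own rule syntax and evaluation order.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Rule:
    source: str   # CIDR block, for example "10.0.0.0/8"
    port: int     # destination port the rule applies to
    allow: bool   # True to allow, False to deny


def is_allowed(rules: Iterable[Rule], source_ip: str, port: int) -> bool:
    """Return the decision of the first matching rule; deny when nothing matches."""
    address = ipaddress.ip_address(source_ip)
    for rule in rules:
        if rule.port == port and address in ipaddress.ip_network(rule.source):
            return rule.allow
    return False
```

For example, a rule list could allow SSH on port 22 only from an internal 10.0.0.0/8 range while allowing HTTPS on port 443 from anywhere.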

o What is DNS?

DNS stands for Domain Name System. This is a sort of yellow pages of the internet. DNS allows a server name to be converted to its underlying IP address. It’s a very important service for any network, and generally includes many backup servers for when the primaries experience problems.
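From a script, a name lookup goes through the operating system's resolver, which in turn consults DNS. A minimal sketch using the standard socket module follows. The helper name resolve is made up.

```python
import socket
from typing import List


def resolve(hostname: str) -> List[str]:
    """Return the sorted, de-duplicated IP addresses a host name resolves to."""
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling resolve with a public host name returns whatever addresses your resolver reports at that moment, so results can differ between networks and over time.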

o What is a virtual private network?

A VPN provides a network link between a physical datacenter or your office’s network, and your cloud provider. It allows you to elastically grow your existing datacenter using virtual resources, while treating those new boxes more like servers in your existing datacenter. IP addresses and subnets are controlled by your existing network rules and admins.

o Why is security important in web operations?

Since your business assets are primarily stored in digital form, the security of those assets depends on the security of your computer systems. Passwords, firewalls and encryption are all relevant.

o Why is patching software important?

Since security is a moving target, and vulnerabilities are constantly being discovered in software, patching and updates are important. Staying fairly current in applying patches means your network and systems will be more secure.

o What is intrusion detection?

Bugs in software open up vulnerabilities and ways into systems. Intrusion detection attempts to detect such intrusions and prevent further damage.


o What is SaaS – Software as a Service?

Dropbox is one example. Other so-called hold-my-data solutions also fall into this category.

o What is IaaS – Infrastructure as a Service?

This is raw iron, the virtualized datacenters, hosting providers such as Amazon, GoGrid, Joyent, and Rackspace.

o What is PaaS – Platform as a Service?

Solutions such as Heroku, Squarespace, WP Engine and Engine Yard fall into this category. Some provide a platform such as the WordPress CMS, with arbitrary scaling options. Others, like Heroku and Engine Yard, allow Ruby applications to be deployed without the need for a lot of fuss at the operational level.

We’re not done yet. In part three of this series, we’ll hit on DBA skills, and a series of general questions that cut across the spectrum of web operations.


FIX system

The Financial Information eXchange (FIX) protocol is an electronic communications protocol initiated in 1992 for international real-time exchange of information related to securities transactions and markets. With trillions of dollars traded annually on the NASDAQ alone, financial service entities are investing heavily in optimizing electronic trading and employing direct market access (DMA) to increase their speed to financial markets. Managing the delivery of trading applications and keeping latency low increasingly requires an understanding of the FIX protocol.
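To make the protocol more concrete, the sketch below builds and checks a message in the classic FIX tag=value encoding. Fields are separated by the SOH byte (0x01), tag 9 carries the body length, and tag 10 carries a three-digit checksum of all preceding bytes modulo 256. The helper names are made up, and the example omits header fields such as SendingTime (52) that a real session would require.

```python
from typing import Iterable, Tuple

SOH = "\x01"  # field delimiter in the tag=value encoding


def checksum(data: bytes) -> str:
    """Sum of all bytes modulo 256, rendered as three digits."""
    return f"{sum(data) % 256:03d}"


def build_fix_message(begin_string: str, msg_type: str, fields: Iterable[Tuple[int, object]]) -> bytes:
    """Assemble BeginString (8), BodyLength (9), MsgType (35), the given fields, and CheckSum (10)."""
    body = f"35={msg_type}{SOH}" + "".join(f"{tag}={value}{SOH}" for tag, value in fields)
    head = f"8={begin_string}{SOH}9={len(body.encode('ascii'))}{SOH}"
    without_checksum = (head + body).encode("ascii")
    return without_checksum + f"10={checksum(without_checksum)}{SOH}".encode("ascii")


def is_valid(message: bytes) -> bool:
    """Check that the trailing CheckSum field matches the bytes before it."""
    index = message.rfind(b"\x0110=")
    if index == -1 or not message.endswith(b"\x01"):
        return False
    covered = message[: index + 1]  # everything up to and including the SOH before tag 10
    declared = message[index + 4 : -1].decode("ascii", errors="replace")
    return declared == checksum(covered)
```

For example, build_fix_message("FIX.4.2", "0", [(49, "CLIENT"), (56, "BROKER"), (34, 1)]) produces a heartbeat-style message from sender CLIENT to target BROKER with sequence number 1.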

Diagrammatic representation of FIX system



Enterprise Mobility using REST WebServices