A war story

I once authored a COM+ service hosted in MTS for a large e-commerce site. This was the late 1990s, before service-oriented architecture existed and when distributed computing was just becoming mainstream. The component enabled the e-commerce site to save quotes on its quote management system, a mainframe, by calling a specific program on the mainframe exposed at a particular IP address and port. The quote management service could be called by multiple clients, essentially the various segments of the online store. The component was developed and tested over a period of three months across various development and testing environments, and was deployed just before the start of the holiday season.

The deployment went as per plan well into the evening. By about 9 PM it was complete, and unit tests run from one segment confirmed that the component worked as expected and was saving quotes on the mainframe. After what had been a long day, the rest of the team and I headed home to catch up on sleep. At around 2 AM my phone started buzzing with a message indicating that the component was failing to save quotes, with certain calls timing out. Slowly these timeouts escalated to the point where almost a quarter of all calls were timing out. I connected to check the logs and try to make sense of what was going wrong. There was not much information there to indicate any issue, so I turned on verbose logging at the risk of a small performance hit. After this, the component logs showed that it had received some of the calls the client team claimed to have made, but a lot of other calls were missing entirely. All the teams were assembled on a bridge, and initially they passed the ball around, with the component and client teams each insisting the issue was on the other side.
While the client’s logs indicated a call was made to the component at a particular time, the server log showed the call either arriving much later or never at all. All this while, customers on the site were unable to save any quotes. This was major functionality, since an enterprise segment’s workflow depended on quotes being created and approved before checkout.

As matters escalated, a Sev 1 ticket was cut and incident and problem management were called in. Some of the guys in suits started calling to ask when it would be resolved. The call to this component was put under a microscope and reviewed to determine why calls were going missing on the server. Similarly, the server component was reviewed for any reason it might not be acknowledging some calls. After various changes the server was rebooted and MTS restarted, but it still did not work. The developers and incident support on the call could not identify or isolate the issue, since the behaviour was very random: only certain calls were getting no response back, or a delayed one. One of the incident management folks suggested bringing in network management. Once engaged, they ran a network trace of some of the calls and finally identified a rogue configuration on one of the network switches, which was sending calls down a wrong route so that the call to the server was lost. Once the configuration on the switch was corrected, service returned to normal and 100% of calls were responded to. This simple network switch configuration issue took around six hours to solve and resulted in a rather large loss in revenue. It was 10 AM when service was back to normal and the ticket was closed.

Now the million-dollar question: could the development and deployment teams have anticipated this issue? Could it have been resolved earlier? Is there anything in the design that could have averted it, or enabled the teams to identify the cause faster? Before you jump to say this could have been identified and resolved easily, remember this was distributed computing in the late 1990s.

Model view controller

Model-View-Controller (MVC) has been an important architectural pattern for many years now. It is a powerful and elegant means of separating concerns, and you will find it applied in multiple frameworks across different platforms and languages. For example, ASP.NET has an MVC implementation and Java has Spring MVC; on Apple platforms, Cocoa and UIKit also follow MVC.

MVC separates the user interface of an application into three main aspects:

  • The Model – The model is a set of classes that describe the underlying data being worked with, along with the business rules that govern the modification and manipulation of that data. The model encapsulates the data and all of its behaviour.
  • The View – The view describes the application’s visual elements.
  • The Controller – The controller mediates between the view and model objects, acting as the mode of communication that ties them together.




The traditional MVC pattern is a compound design pattern composed of several basic design patterns. These basic patterns work together to achieve functional separation and paths of communication.


Traditional version of MVC as a compound pattern


Composite – The view objects are designed as a composite of different view objects that work together in a coordinated fashion. User interaction can take place at any layer of the composite structure.

Strategy – The controller can follow various strategies to mediate between the view and model objects.

Observer – The model keeps the view objects informed of any changes to the data model by following the observer pattern.
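To make the interplay of these patterns concrete, here is a minimal, illustrative Python sketch (the class and method names are my own, not taken from any particular framework): views register themselves with the model as observers, the model pushes changes to them, and the controller mediates between user input and the model.

```python
# Minimal MVC sketch: the model is the Observer "subject",
# views register as observers, and the controller mediates input.

class Model:
    def __init__(self):
        self._observers = []
        self._data = {}

    def attach(self, view):
        self._observers.append(view)

    def set(self, key, value):
        self._data[key] = value
        for view in self._observers:   # Observer: push the change to every view
            view.update(key, value)

class View:
    def __init__(self):
        self.rendered = []

    def update(self, key, value):      # called by the model whenever data changes
        self.rendered.append(f"{key} = {value}")

class Controller:
    def __init__(self, model):
        self.model = model

    def handle_input(self, key, value):  # mediates between user input and the model
        self.model.set(key, value)

model = Model()
view = View()
model.attach(view)
Controller(model).handle_input("quote_id", 42)
print(view.rendered)  # ['quote_id = 42']
```

Note that the view never talks to the controller and the model never knows which concrete views exist; each piece can be replaced independently, which is the point of the separation.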

Multiversion Concurrency Control

Every developer I know uses databases on a daily basis, be it SQL Server or Oracle, but quite a few of them are stumped when asked how the following SQL will execute and what its output would be:

insert into Employee select * from Employee;
select * from Employee;

Additionally, what happens if multiple people execute these statements concurrently? Will the select block the insert? Will the insert block the select? How does the system maintain data integrity in this case? The answers I have received range from “this is not a valid SQL statement” to “this will be an infinite loop.”

Some have got it right, but without the understanding of what happens in the background that would let them give a consistent and accurate answer. The situation is further complicated by the fact that the mechanism which solves this conundrum is referred to differently across the various database and transactional systems.

A simple way of solving this problem is to use locks or latching. Locking ensures data integrity, but it also serializes all reads and writes. This approach is definitely not scalable, since a lock allows only a single transaction, whether read or write, to proceed. Locking has since evolved with read-only locks and other variations, but it is still inherently a concurrency nightmare. A better approach, which ensures data integrity while also preserving scalability, is the multiversion concurrency control (MVCC) pattern.

Multiversion concurrency control involves tagging an object with read and write timestamps, so that an access history is maintained for each data object. This timestamp information is maintained via change set numbers, which are generally stored together with the modified data in rollback segments. This enables multiple “point in time” consistent views based on the stored change set numbers. MVCC thus enables read-consistent queries and non-blocking reads by providing these “point in time” views without the need to lock the whole table. In Oracle, these change set numbers are called system change numbers, commonly referred to as SCNs. This is one of the most important scalability features of a database, since it provides maximum concurrency while maintaining read consistency.
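As a toy illustration of the idea (this is my own sketch, not any particular database’s implementation; the names are invented): each write creates a new version of a row tagged with a change number, and a reader sees the latest version at or before the change number current when its query began, so writers never block readers.

```python
# Toy MVCC store: each write appends a version tagged with a change number;
# a read returns a consistent snapshot "as of" a given change number.

class MvccStore:
    def __init__(self):
        self.scn = 0        # monotonically increasing change number
        self.versions = {}  # key -> list of (scn, value), oldest first

    def write(self, key, value):
        self.scn += 1
        self.versions.setdefault(key, []).append((self.scn, value))

    def read(self, key, as_of):
        # Return the newest value committed at or before `as_of`.
        # Later writes are simply invisible, so the read never blocks.
        for scn, value in reversed(self.versions.get(key, [])):
            if scn <= as_of:
                return value
        return None  # key did not exist at that point in time

store = MvccStore()
store.write("quote", "draft")       # committed at SCN 1
snapshot = store.scn                # a reader's query begins here
store.write("quote", "approved")    # concurrent write at SCN 2
print(store.read("quote", snapshot))    # draft   (snapshot view)
print(store.read("quote", store.scn))   # approved (current view)
```

This is why the insert-from-select puzzle above has a well-defined answer: the select inside the insert reads the table as of the change number at which the statement began, so it does not see the rows the insert itself is adding.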

Internet of things

Microsoft recently announced that Visual Studio will start supporting device development, beginning with Intel’s Galileo board. This is not entirely new to Visual Studio: one of the earliest add-ons to support board-based development was Visual Micro, which supported the Arduino board.

Arduino is an open-source physical computing platform based on a simple I/O board and an integrated development environment (IDE) that implements the Processing/Wiring language. Arduino can be used to develop stand-alone interactive objects and installations, or can be connected to software on your computer. The first Arduino board was produced in January 2005. The project is the result of the collaboration of many people, primarily David Cuartielles and Massimo Banzi, who taught physical computing; the others are David Mellis, who built the software engine, Tom Igoe, and Gianluca Martino.



Board Layout

Arduino board layout


Starting clockwise from the top center:

• Analog Reference pin (orange)

• Digital Ground (light green)

• Digital Pins 2-13 (green)

• Digital Pins 0-1/Serial In/Out – TX/RX (dark green)

- These pins cannot be used for digital i/o (digitalRead and digitalWrite) if you are also using serial communication (e.g. Serial.begin).

• Reset Button – S1 (dark blue)

• In-circuit Serial Programmer (blue-green)

• Analog In Pins 0-5 (light blue)

• Power and Ground Pins (power: orange, grounds: light orange)

• External Power Supply In (9-12VDC) – X1 (pink)

• Toggles External Power and USB Power (place jumper on two pins closest to desired supply) – SV1 (purple)

• USB (used for uploading sketches to the board and for serial communication between the board and the computer; can be used to  power the board) (yellow)

Installing the Arduino Software on Your Computer:

To program the Arduino board, you must first download the development environment (the IDE) from www.arduino.cc/en/Main/Software. Choose the right version for your operating system. After this, you need to install the drivers that allow your computer to talk to your board through the USB port.

Installing Drivers: Windows

Plug the Arduino board into the computer. When the Found New Hardware Wizard window comes up, Windows will first try to find the driver on the Windows Update site. Windows Vista will also first attempt Windows Update; if that fails, you can instruct it to look in the Drivers\FTDI USB Drivers folder. You’ll go through this procedure twice, because the computer first installs the low-level driver, then installs a piece of code that makes the board look like a serial port to the computer. Once the drivers are installed, you can launch the Arduino IDE and start using Arduino.

Arduino Compiler


Arduino Programming language

The basic structure of the Arduino programming language is fairly simple and runs in at least two parts. These two required parts, or functions, enclose blocks of statements.

void setup() { }
void loop() { }

Here setup() is the preparation and loop() is the execution. Both functions are required for the program to work.

The setup function should follow the declaration of any variables at the very beginning of the program. It is the first function to run in the program, is run only once, and is used to set pinMode or initialize serial communication.

The loop function follows next and includes the code to be executed continuously – reading inputs, triggering outputs, etc. This function is the core of all Arduino programs and does the bulk of the work.


A function is a block of code that has a name and a block of statements that are executed when the function is called.

Custom functions can be written to perform repetitive tasks and reduce clutter in a program. Functions are declared by first declaring the function type. This is the type of value to be returned by the function, such as ‘int’ for an integer type function. If no value is to be returned, the function type would be void. After the type, declare the name given to the function and, in parentheses, any parameters being passed to the function.

type functionName(parameters)






A variable is a way of naming and storing a value for later use by the program. As their name suggests, variables can be continually changed, as opposed to constants, whose value never changes. A variable can be declared in a number of locations throughout the program, and where this declaration takes place determines which parts of the program can use the variable.

Variable scope

A variable can be declared at the beginning of the program before void setup(), locally inside of functions, and sometimes within a statement block such as for loops. Where the variable is declared determines the variable scope, or the ability of certain parts of a program to make use of the variable.

A global variable is one that can be seen and used by every function and statement in a program. This variable is declared at the beginning of the program, before the setup() function.

A local variable is one that is defined inside a function or as part of a for loop. It is only visible and can only be used inside the function in which it was declared. It is therefore possible to have two or more variables of the same name in different parts of the same program that contain different values. Ensuring that only one function has access to its variables simplifies the program and reduces the potential for programming errors.

Support Vector Machines (SVM)


Snooker SVM

Support vector machines (SVMs) are powerful tools for data classification, used for both linear and nonlinear data. Classification is achieved by a linear or nonlinear mapping that transforms the original training data into a higher dimension. Within this new dimension, the algorithm searches for the linear optimal separating hyperplane (i.e., a “decision boundary” separating the tuples of one class from another). With an appropriate nonlinear mapping to a sufficiently high dimension, data from two classes can always be separated by a hyperplane. The SVM finds this hyperplane using support vectors (“essential” training tuples) and margins (defined by the support vectors). SVMs became famous when, using images as input, they gave accuracy comparable to neural networks with hand-designed features on a handwriting recognition task.

Support vector machines select a small number of critical boundary instances called support vectors from each class and build a linear discriminant function that separates them as widely as possible. This instance-based approach transcends the limitations of linear boundaries by making it practical to include extra nonlinear terms in the function, making it possible to form quadratic, cubic, and higher-order decision boundaries.

SVMs are based on an algorithm that finds a special kind of linear model: the maximum-margin hyperplane. The maximum-margin hyperplane is the one that gives the greatest separation between the classes; it comes no closer to either than it has to.

Using a mathematical transformation, the algorithm moves the original data set into a new mathematical space in which the decision boundary is easy to describe. Because this transformation depends only on a simple computation involving “kernels,” the technique is called the kernel trick. The kernel can typically be set to one of four types: linear, polynomial, radial basis function, or sigmoid.
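A deliberately tiny Python example illustrates the idea (the feature map here is hand-picked for the illustration, not produced by any SVM library): points at -1 and 1 in one class and a point at 0 in the other cannot be split by any single threshold on a line, but mapping x to (x, x²) makes them separable by the horizontal line x² = 0.5. The kernel then computes dot products in the mapped space directly from the original inputs.

```python
# 1-D data: class A at -1 and 1, class B at 0.
# No single threshold on x separates them, but after the
# mapping phi(x) = (x, x**2) the line x2 = 0.5 does.

def phi(x):
    return (x, x * x)   # explicit nonlinear feature map

class_a = [-1.0, 1.0]
class_b = [0.0]

# In the mapped space, the second coordinate separates the classes.
assert all(phi(x)[1] > 0.5 for x in class_a)
assert all(phi(x)[1] < 0.5 for x in class_b)

# The "kernel trick": the dot product in the mapped space can be
# computed directly from the original inputs, without forming phi(x).
def kernel(x, y):
    return x * y + (x * x) * (y * y)

for x in (-1.0, 0.0, 1.0):
    for y in (-1.0, 0.0, 1.0):
        px, py = phi(x), phi(y)
        assert kernel(x, y) == px[0] * py[0] + px[1] * py[1]

print("separable after mapping")
```

Real kernels (polynomial, RBF) do the same thing for mappings into spaces far too large to construct explicitly, which is why the SVM optimization only ever needs kernel evaluations, never the mapped points themselves.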

SVMs were developed by Cortes & Vapnik (1995) for binary classification.

Class separation: we are looking for the optimal separating hyperplane between the two classes, maximizing the margin between the classes’ closest points. The points lying on the boundaries are called support vectors, and the middle of the margin is our optimal separating hyperplane.

Overlapping classes: data points on the “wrong” side of the discriminant margin are weighted down to reduce their influence (a “soft margin”).

Nonlinearity: when we cannot find a linear separator, data points are projected into a (usually) higher-dimensional space where they effectively become linearly separable (this projection is realised via kernel techniques).

Problem solution: the whole task can be formulated as a quadratic optimization problem which can be solved by known techniques.

A program able to perform all these tasks is called a Support Vector Machine.

Graphing my Facebook network – Social Network Analysis

Pradeep Facebook Network

Netvizz is a cool tool which can export your Facebook data in a format that makes it easier to visualize or analyze offline. There are other apps on the Facebook platform that do this within predefined bounds inside the app itself, but what I like about Netvizz is that it lets you extract the information and play with it using any tool you are comfortable with, like R or Gephi. This sort of visualization is at the core of social network analysis systems.
I spent some time over the weekend playing around with my Facebook network information. I extracted my network using Netvizz; it produces a GDF file with all the necessary information that can be imported into Gephi directly. The idea was to see how closely the Facebook visualization of my friend network matches my real-life network. I understand that online and offline social networks differ, but on a massive system like Facebook my guess was that it should mirror my real-life network fairly realistically. I built the network, ran a community detection algorithm, and applied a ForceAtlas2 layout. This is the network visualization I ended up with after tweaking a few things. The network diagram is both surprising and a bit scary.

Accurate reflection of my social network groups: the network analysis shows the various groups of friends by my life events and their relationships. The green bunch of circles on the right are my friends from my ex-company, Logica. The small red bunch below is my network of friends from IIM-Lucknow, and the big blob of circles on the top left are my friends from Dell. There were other individual circles in between too, which I removed by filtering on degree of connectedness; these represented friends who were not part of any of these big networks, people with whom I was connected outside of all these bigger networks.
The network information on Facebook is a more or less accurate reflection of my networks in real life, in terms of how I could group them by connectedness or by specific timelines of my life.

Not so accurate reflection of my social network groups: my assumptions about who the best networkers were was shattered when I saw that some of the people in the top 10 by degree of connectedness were actually more introverted. Maybe the online world provides them a degree of comfort in connecting with people. Another surprise was that some of my friends in these disparate networks were also connected with each other; this can be seen in the two dots at the bottom connecting the bigger blobs to them. That said, this is still not an accurate reflection of my actual social life and does not in any way reflect my actual day-to-day social interactions.

Inference on connectedness: the connectedness inference was also surprising, since it was an accurate reflection of the groups of people and their interactions. I was pretty surprised at how, even within my current organization, the connectedness numbers could pinpoint teams. This is reflected in the big blob on the top right, which has three distinct colors representing the three large divisions I have worked across in my organization.

I will be playing around a lot more with this data, and I am planning a further analysis of my LinkedIn network, overlaying it onto my Facebook network to see how it compares.

Strategy Execution

Strategy helps an organization position itself uniquely in order to create value: exploiting unfolding opportunities, hedging against threats, leveraging strengths, and removing weaknesses.

Strategy execution has two components: the action planning that operationalizes the core strategy, and its physical implementation. Strategy execution is one integrated process in which action planning and physical implementation proceed on a real-time basis. High-quality execution presupposes high-quality action planning, assigning the right people to the right jobs, adequate follow-through, and robust MIS and review processes. Strategy execution involves the optimal balancing of seven key variables: Strategy, Structure, Systems, Staff, Skills, Style, and Shared values. Strategy execution is the key role of the business leader. Execution is a discipline and must be learned specifically. The primary areas are:

  • Project Management
  • Scheduling
  • Cost Management
  • People Management
  • Time Management
  • Interpersonal skills
  • Negotiation and Conflict resolution
  • Problem solving skills
  • Communication skills
  • Relationship management
  • Risk management
  • Balanced Scorecard

The four building blocks of Strategy execution are

  • Information flow
  • Decision Rights
  • Right motivators
  • Right Structure


Information flow plays a key role in ensuring that information about the competitive environment is received at the appropriate time at all appropriate levels, and that information flows freely across organizational boundaries. It is also critical to ensure that field and line employees have the information they need to understand the bottom-line impact of their day-to-day choices, that line managers have access to the metrics they need to measure the key drivers of their business, and that conflicting messages are rarely sent.

Decision rights ensure that everyone has a good idea of the decisions and actions for which they are responsible, and that senior managers have a good feel for the operating decisions being made. The decision-making culture in such an organization is one of persuasion and consensus rather than command and control.