Multiversion Concurrency Control

Every developer I know has used or uses databases on a daily basis, be it SQL Server or Oracle, but quite a few of them are stumped when asked how this SQL will execute and what its output would be.

insert into Employee select * from Employee;
select * from Employee;

Additionally, what happens if multiple people execute this statement concurrently? Will the select block the insert? Will the insert block the select? How does the system maintain data integrity in this case? The answers have ranged from "this is not a valid SQL statement" to "this will be an infinite loop".

Some have also got it right, but without understanding what happens in the background well enough to give a consistent and accurate answer. The situation is further complicated by the fact that the mechanism which solves this conundrum goes by different names in the various database and transactional systems.
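One way to check the single-user answer is simply to try it. The sketch below uses Python's built-in sqlite3 module with an in-memory database. SQLite, the two-column Employee table and the sample rows are assumptions made for illustration; it is not one of the databases mentioned above, and it says nothing about concurrent behaviour.

```python
import sqlite3

# In-memory database with a hypothetical Employee table (no primary key,
# so duplicate rows are allowed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [("Asha", "Eng"), ("Ravi", "Ops"), ("Mei", "Eng")],
)

before_count = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]

# The statement from the post: the SELECT sees the table as it was when the
# statement started, so the insert terminates instead of chasing its own rows.
conn.execute("INSERT INTO Employee SELECT * FROM Employee")

after_count = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
rows = conn.execute("SELECT * FROM Employee").fetchall()
```

Here the statement is valid and finishes: the row count goes from 3 to 6, and the final select returns every original row twice.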

A simple way of solving the above problem is by using locks or latches. Using locks ensures data integrity, but it also serializes all reads and writes. This approach is definitely not scalable, since the lock allows only a single transaction, either a read or a write, to proceed at a time. Locking has evolved further with read-only locks and other variations, but it is still inherently a concurrency nightmare. A better approach that ensures both data integrity and scalability is the Multiversion Concurrency Control (MVCC) pattern.

Multiversion Concurrency Control involves tagging an object with read and write timestamps. This way an access history is maintained for a data object. The timestamp information is maintained via change set numbers, which are generally stored together with the modified data in rollback segments. This enables multiple "point in time" consistent views based on the stored change set numbers. MVCC enables read-consistent queries and non-blocking reads by providing these "point in time" consistent views without the need to lock the whole table. In Oracle these change set numbers are called System Change Numbers, commonly referred to as SCNs. This is one of the most important scalability features of a database, since it provides maximum concurrency while maintaining read consistency.
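As a rough illustration of the idea, here is a minimal Python sketch of a versioned store. The class `MVCCStore`, its methods and the key names are hypothetical; real databases keep old versions in rollback (undo) structures and also deal with transactions, commits and cleanup, none of which is modelled here.

```python
class MVCCStore:
    """Toy multiversion store: writes never overwrite, they append a new
    version tagged with an increasing change number (similar in spirit to an SCN)."""

    def __init__(self):
        self._versions = {}  # key -> list of (change_number, value), oldest first
        self._clock = 0      # change number of the most recent write

    def write(self, key, value):
        """Append a new version of key and return its change number."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        """Return the change number a reader should use as its point in time."""
        return self._clock

    def read(self, key, as_of):
        """Return the newest value of key written at or before as_of."""
        for change_number, value in reversed(self._versions.get(key, [])):
            if change_number <= as_of:
                return value
        return None  # key did not exist at that point in time


store = MVCCStore()
store.write("emp:1:salary", 100)
reader_view = store.snapshot()     # a reader starts here
store.write("emp:1:salary", 120)   # a writer changes the row afterwards

old_value = store.read("emp:1:salary", reader_view)
new_value = store.read("emp:1:salary", store.snapshot())
```

The reader that took its snapshot before the second write still sees 100, while a reader starting afterwards sees 120, and neither had to wait for a lock.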

Internet of Things

Microsoft recently announced that Visual Studio will start supporting device development, beginning with Intel's Galileo board. This is not entirely new to Visual Studio. One of the earliest add-ons to support board-based development was Visual Micro, which supported the Arduino board.

Arduino is an open-source physical computing platform based on a simple I/O board and an integrated development environment (IDE) that implements the Processing/Wiring language. Arduino can be used to develop stand-alone interactive objects and installations, or it can be connected to software on your computer. The first Arduino board was produced in January 2005. The project is the result of the collaboration of many people, primarily David Cuartielles and Massimo Banzi, who taught physical computing. The others are David Mellis, who built the software engine, Tom Igoe and Gianluca Martino.

 

Arduino

Board Layout

Arduino board layout

 

Starting clockwise from the top center:

• Analog Reference pin (orange)

• Digital Ground (light green)

• Digital Pins 2-13 (green)

• Digital Pins 0-1/Serial In/Out – TX/RX (dark green)

- These pins cannot be used for digital i/o (digitalRead and digitalWrite) if you are also using serial communication (e.g. Serial.begin).

• Reset Button – S1 (dark blue)

• In-circuit Serial Programmer (blue-green)

• Analog In Pins 0-5 (light blue)

• Power and Ground Pins (power: orange, grounds: light orange)

• External Power Supply In (9-12VDC) – X1 (pink)

• Toggles External Power and USB Power (place jumper on two pins closest to desired supply) – SV1 (purple)

• USB (used for uploading sketches to the board and for serial communication between the board and the computer; can be used to  power the board) (yellow)

Installing Arduino Software on Your Computer

To program the Arduino board, you must first download the development environment (the IDE) from www.arduino.cc/en/Main/Software. Choose the right version for your operating system. After this, you need to install the drivers that allow your computer to talk to your board through the USB port.

Installing Drivers: Windows

Plug the Arduino board into the computer. When the Found New Hardware Wizard window comes up, Windows will first try to find the driver on the Windows Update site. Windows Vista will likewise first attempt to find the driver on Windows Update; if that fails, you can instruct it to look in the Drivers\FTDI USB Drivers folder. You'll go through this procedure twice, because the computer first installs the low-level driver, then installs a piece of code that makes the board look like a serial port to the computer. Once the drivers are installed, you can launch the Arduino IDE and start using Arduino.

Arduino Compiler

 

Arduino Programming language

The basic structure of the Arduino programming language is fairly simple and runs in at least two parts. These two required parts, or functions, enclose blocks of statements.

void setup()
{
    // statements that run once, at start-up
}

void loop()
{
    // statements that run over and over
}

Where setup() is the preparation, loop() is the execution. Both functions are required for the program to work.

The setup function should follow the declaration of any variables at the very beginning of the program. It is the first function to run in the program, is run only once, and is used to set pinMode or initialize serial communication.

The loop function follows next and includes the code to be executed continuously – reading inputs, triggering outputs, etc. This function is the core of all Arduino programs and does the bulk of the work.

Functions

A function is a block of code that has a name and a block of statements that are executed when the function is called.

Custom functions can be written to perform repetitive tasks and reduce clutter in a program. Functions are declared by first declaring the function type. This is the type of value to be returned by the function, such as 'int' for an integer type function. If no value is to be returned, the function type is void. After the type, declare the name given to the function and, in parentheses, any parameters being passed to the function.

type functionName(parameters)
{
    statements;
}

 

Variables

A variable is a way of naming and storing a value for later use by the program. As their name suggests, variables can be continually changed, as opposed to constants, whose value never changes. A variable can be declared in a number of locations throughout the program, and where this declaration takes place determines what parts of the program can use the variable.

Variable scope

A variable can be declared at the beginning of the program before void setup(), locally inside of functions, and sometimes within a statement block such as for loops. Where the variable is declared determines the variable scope, or the ability of certain parts of a program to make use of the variable.

A global variable is one that can be seen and used by every function and statement in a program. This variable is declared at the beginning of the program, before the setup() function.

A local variable is one that is defined inside a function or as part of a for loop. It is only visible and can only be used inside the function in which it was declared. It is therefore possible to have two or more variables of the same name in different parts of the same program that contain different values. Ensuring that only one function has access to its variables simplifies the program and reduces the potential for programming errors.

Support Vector Machines (SVM)

 

Snooker SVM

Support vector machines (SVMs) are powerful tools for classifying both linear and nonlinear data. Classification is achieved by using a linear or nonlinear mapping to transform the original training data into a higher dimension. Within this new dimension, the SVM searches for the linear optimal separating hyperplane (i.e. a "decision boundary" separating the tuples of one class from another). With an appropriate nonlinear mapping to a sufficiently high dimension, data from two classes can always be separated by a hyperplane. The SVM finds this hyperplane using support vectors ("essential" training tuples) and margins (defined by the support vectors). SVMs became famous when, using images as input, they gave accuracy comparable to neural networks with hand-designed features in a handwriting recognition task.

Support vector machines select a small number of critical boundary instances called support vectors from each class and build a linear discriminant function that separates them as widely as possible. This instance-based approach transcends the limitations of linear boundaries by making it practical to include extra nonlinear terms in the function, making it possible to form quadratic, cubic, and higher-order decision boundaries.

SVMs are based on an algorithm that finds a special kind of linear model: the maximum-margin hyperplane. The maximum-margin hyperplane is the one that gives the greatest separation between the classes; it comes no closer to either than it has to.
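To make "greatest separation" concrete, here is a small Python sketch that measures the margin of a candidate hyperplane w·x + b = 0 over a labelled data set. The function names, the four sample points and the two candidate hyperplanes are made up for illustration; this only compares margins and is not an SVM training routine.

```python
import math


def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def margin(w, b, points, labels):
    """Smallest distance from the hyperplane to any correctly classified point.

    Labels are +1 or -1; a negative result means some point is on the wrong side.
    """
    return min(y * signed_distance(w, b, x) for x, y in zip(points, labels))


# Hypothetical toy data: two points per class.
points = [(1, 1), (2, 3), (-1, -1), (-2, 0)]
labels = [+1, +1, -1, -1]

margin_diagonal = margin((1, 1), 0, points, labels)  # hyperplane x1 + x2 = 0
margin_vertical = margin((1, 0), 0, points, labels)  # hyperplane x1 = 0
```

Both candidates separate the two classes, but the diagonal one keeps every point at least sqrt(2) away, while the vertical one comes within a distance of 1. An SVM would prefer the former; the points that sit exactly at the minimum distance are the support vectors.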

Using a mathematical transformation, an SVM moves the original data set into a new mathematical space in which the decision boundary is easy to describe. Because this transformation depends only on a simple computation involving "kernels", this technique is called the kernel trick. The kernel can typically be set to one of four values: linear, polynomial, radial and sigmoid.
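The four kernels can be written down in a few lines. The following Python sketch uses the common textbook forms; the parameter names `gamma`, `coef0` and `degree` and their defaults are assumptions borrowed from typical SVM libraries, not taken from any specific tool. The last function spells out the feature space that the degree-2 polynomial kernel works in for 2-D inputs, which is what the kernel trick lets us avoid computing.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=0.0):
    return (gamma * dot(x, y) + coef0) ** degree


def radial_kernel(x, y, gamma=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, y) + coef0)


def quadratic_features(x):
    """Explicit feature map for the 2-D kernel (x.y)**2."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


x, y = (1.0, 2.0), (3.0, 4.0)
via_kernel = polynomial_kernel(x, y, degree=2)                       # stays in 2-D
via_features = dot(quadratic_features(x), quadratic_features(y))     # goes to 3-D
```

Both routes give the same number (up to floating-point rounding), but the kernel never builds the higher-dimensional vectors. That saving is what makes very high-dimensional mappings practical.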

SVMs were developed by Cortes & Vapnik (1995) for binary classification.

Class separation: We are looking for the optimal separating hyperplane between the two classes by maximizing the margin between the classes' closest points. The points lying on the boundaries are called support vectors, and the middle of the margin is our optimal separating hyperplane.

Overlapping classes: Data points on the "wrong" side of the discriminant margin are weighted down to reduce their influence ("soft margin").

Nonlinearity: When we cannot find a linear separator, data points are projected into a (usually) higher-dimensional space where the data points effectively become linearly separable (this projection is realised via kernel techniques).

Problem solution: The whole task can be formulated as a quadratic optimization problem which can be solved by known techniques.

A program able to perform all these tasks is called a Support Vector Machine.

Graphing my Facebook network – Social Network Analysis

Pradeep Facebook Network

Netvizz is a cool tool which can export your Facebook data in a format that makes it easier to visualize or to do a network analysis offline. There are other apps on the Facebook platform which do this using predefined bounds within the app itself, but what I like about Netvizz is that it lets you extract this information and play with it using any tool you are comfortable with, like R, Gephi etc. This sort of visualization is at the core of social network analysis systems.

I spent some time playing around with my Facebook network information over the weekend. I extracted my network using Netvizz. It gives a GDF file with all the necessary information, which can be imported into Gephi directly. The idea was to see how well the Facebook visualization of my friend network compares with my real life network. I do understand that online and offline social networks differ, but on a massive system like Facebook my guess was that it should mirror my real life network fairly realistically. I built the network, ran the community detection algorithm and applied a ForceAtlas2 layout. This is the network visualization I ended up with after tweaking a few things. The network diagram is both surprising and a bit scary.

Accurate reflection of my Social network groups: The network analysis shows the various groups of friends by my life events and their relationships. The green bunch of circles on the right are my friends from my former company, Logica. The small red bunch below is my network of friends from IIM-Lucknow, and the big blob of circles on the top left are my friends from Dell. There were other individual circles in between too. They represented friends with whom I was connected outside of all these bigger networks, and they were removed because I filtered by degree of connectedness.

The network information on Facebook is more or less an accurate reflection of my networks in real life, in terms of how I could group them by connectedness or by specific timelines of my life.

Not so Accurate reflection of my Social network groups: My assumptions about who the best networkers were got shattered when I saw that some of the people in the top 10 by degree of connectedness were actually more introverted. Maybe the online world provides them a degree of comfort in connecting with people. Another surprise was that some of my friends in these disparate networks were also connected with each other. This can be seen in the 2 dots at the bottom connecting the bigger blob to them. Another aspect is that this is still not an accurate reflection of my actual social life and does not in any way reflect my actual day to day social interactions.
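The degree ranking and the degree filter mentioned above are simple to reproduce outside Gephi. Here is a minimal Python sketch, assuming the friendships have already been read into a list of (person, person) pairs; the names, the sample edges and the function names are made up for illustration and are not the Netvizz export itself.

```python
from collections import Counter


def degree_by_node(edges):
    """Count how many connections each person has within the network."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


def top_connected(edges, n=10):
    """Return the n most connected people as (name, degree) pairs."""
    return degree_by_node(edges).most_common(n)


def filter_by_min_degree(edges, min_degree):
    """Keep only edges whose two endpoints both have at least min_degree connections."""
    degree = degree_by_node(edges)
    return [
        (a, b) for a, b in edges
        if degree[a] >= min_degree and degree[b] >= min_degree
    ]


# Hypothetical friendships: a tight trio plus two loosely attached people.
edges = [("anu", "bala"), ("anu", "chen"), ("anu", "dev"),
         ("bala", "chen"), ("esha", "farid")]

most_connected = top_connected(edges, n=1)
core_edges = filter_by_min_degree(edges, min_degree=2)
```

With this toy data "anu" tops the ranking with 3 connections, and filtering at a minimum degree of 2 leaves only the edges of the tight trio, which is the same effect that removed the isolated circles from my diagram.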

Inference on Connectedness: The connectedness inference was also surprising, since it was an accurate reflection of the groups of people and their interactions. I was pretty surprised by how, even within my current organization, the connectedness numbers could pinpoint teams. This is reflected in the big blob on the top right, which has 3 distinct colors representing the 3 large divisions I have worked across in my organization.

I will be playing around a lot more with this data, and I am planning to analyze my LinkedIn network and overlay it onto my Facebook network to see how the two compare.

Strategy Execution

Strategy helps an organization position itself uniquely in order to create value, exploit unfolding opportunities, hedge against threats, leverage strengths and remove weaknesses.

Strategy execution has two components, namely the action planning to operationalize the core strategy and its physical implementation. Strategy execution is one integrated process in which action planning and physical implementation come together in real time. High quality execution presupposes high quality action planning, assigning the right people to the right jobs, adequate follow-through, and robust MIS and review processes. Strategy execution involves the optimal balancing of seven key variables, namely Strategy, Structure, Systems, Staff, Skills, Style and Shared values. Strategy execution is the key role of the business leader. It is a discipline and must be learned specifically. The primary areas are

  • Project Management
  • Scheduling
  • Cost Management
  • People Management
  • Time Management
  • Interpersonal skills
  • Negotiation and Conflict resolution
  • Problem solving skills
  • Communication skills
  • Relationship management
  • Risk management
  • Balanced Scorecard

The four building blocks of Strategy execution are

  • Information flow
  • Decision Rights
  • Right motivators
  • Right Structure

 

Information flow plays a key role in ensuring that information about the competitive environment is received at the appropriate time at all appropriate levels and that information flows freely across organizational boundaries. It is also critical to ensure that field and line employees have the information they need to understand the bottom line impact of their day to day choices, that line managers have access to the metrics they need to measure the key drivers of their business, and that conflicting messages are rarely sent.

Decision rights ensure that everyone has a good idea of the decisions and actions for which they are responsible, and that senior managers have a good feel for the operating decisions being made. The decision making culture in such an organization is one of persuasion and consensus rather than command and control.

 

Motivators

 

Structure

Concept of Strategy and Strategy Process

In the new business environment customers are not just demanding, they also have infinite options. Competition is not just intensifying, there are also new sources of competition. Resources are not just limited, they are fluid and move fast. Investors are impatient and look for above average returns consistently and constantly. Huge opportunities coexist with massive risks. An organization must think and act differently and smartly in order to face the new environmental context. Strategy derived through the strategy process is what makes the organization stand apart from the "also-rans" and perform differently.

Strategic decisions are context specific and indicate the long term direction of an organization. These decisions define the context specific, sustainable competitive advantage of the organization. They establish the scope of the organization and the strategic fit between resources, activities and environmental contingencies. Strategic decisions aim to exploit existing capabilities or explore new opportunities; they require major resource commitments and are irreversible in the short run. They affect all organizational decisions and are shaped by the values and expectations of key stakeholders, with the primary aim of wealth creation. They specifically help answer three questions for each business in the portfolio, namely

  1. Where to Compete? ( Market selection )
  2. How to win orders? ( Build Competitive advantage)
  3. How to deliver? ( Value delivery process )

Strategic decisions are complex in nature and require a vision for the future. They demand an integrated approach to managing an organization. They require change within the organization and management of relationships and networks outside the organization.

There are different types of strategy namely

  1. Corporate / Portfolio Strategy: Portfolio strategy, also called corporate strategy, involves portfolio selection considering synergy and risk-return criteria. It decides resource allocation (organizational, financial, people etc.) across businesses.
  2. Business Strategy: Business strategy is a set of decisions that helps a business position itself uniquely and distinctly.
  3. Functional Strategy: Functional strategies include strategies and programs at the functional level to operationalize the overall strategy and align the functions with each other.
  4. Operational Strategy: Operational strategy includes action plans and programs at the frontlines of an organization, where strategy becomes operational.

External Environment, Opportunities, Threats, Industry competition and Competitor analysis.

External environment: General, Industry and Competitor.

The external environment includes the areas of General, Industry and Competitor environment. The general environment is the broader society dimensions that influence an industry and the firms within it. It is grouped into 7 dimensions or ‘environmental segments’ which cannot be controlled or manipulated. However segment intelligence of each of these can help reorient strategy to mitigate influence in the long term.

The industry environment is a set of factors which directly influence a firm's competitive actions and responses. These factors can be analysed using Porter's Five Forces model. Competitor analysis is used to gather and interpret competitor information. The competitor environment gives information about a firm's direct and indirect competitors and the competitive dynamics expected to impact a firm's efforts to generate an above average return.

External Environment Analysis.

An opportunity is a general environmental condition that, if exploited, helps a company achieve strategic competitiveness. A threat is a general environmental condition that may hinder a company's efforts to achieve strategic competitiveness. There are four components of external environmental analysis, namely

  • Scanning: the process of identifying early signals of environmental changes and trends.
  • Monitoring: the process of detecting meaning through ongoing observation of the environmental changes and trends identified through scanning.
  • Forecasting: the process of developing projections of anticipated outcomes based on monitored changes and trends.
  • Assessing: the process of determining the timing and importance of environmental changes and trends that impact a firm's strategies and their management.

 

Industry environmental analysis

An industry is defined as a group of firms producing products that are close substitutes. The industry environment has a higher impact on a firm's general competitiveness and its ability to earn above average returns than the general environment does. The intensity of competition and the profit potential are a function of the five forces in Porter's Five Forces analysis.

Porter's Five Forces

  1. Threat of new entrants: New entrants can threaten the market share of existing competitors and bring additional production capacity to the industry. This threat is a function of two factors, namely
      1. Barriers to entry: Economies of scale, product differentiation, capital requirements, switching costs, access to distribution channels, cost disadvantages and government policy are the various barriers to entry faced by a new entrant into an industry.
      2. Expected retaliation: An expectation of vigorous and swift retaliation reduces the likelihood of entry. Retaliation is generally vigorous when the existing firm has a large stake in the industry, when it has invested substantial resources and when industry growth is slow.
  2. Bargaining power of suppliers: Suppliers can exercise their power by reducing quality or increasing price. Suppliers are powerful when there are very few large suppliers and they are more concentrated than the industry they sell to, there are no substitutes for the supplier's product, the firms are not a significant customer to the supplier group, the supplier's goods are critical to a buyer's success, there is a high switching cost due to the effectiveness of a supplier's products, and there exists a threat of forward integration.
  3. Bargaining power of buyers: Buyers want to buy at the lowest price and demand higher levels of service at the best quality. They are powerful when they purchase a large proportion of the industry's output, when the product's sales account for a significant portion of the seller's annual revenue, and when the industry's products are undifferentiated and standardized, raising the threat of backward integration.
  4. Threat of substitute products: This is the threat that arises when goods or services from outside the given industry perform the same or similar functions at a competitive price or have low switching costs. Product and service differentiation helps overcome the threat of substitute products. E.g. plastic has replaced steel and other materials in many applications at a very competitive price and value proposition.
  5. Intensity of rivalry among competing firms: The intensity of rivalry in an industry is the extent to which competitors within an industry compete with one another and limit each other's profit potential. If rivalry is fierce, the profit potential in the industry declines for all firms. Low intensity of rivalry increases profit potential and makes the industry less competitive. Intensity of rivalry is high if
        • Competitors are numerous
        • Competitors have equal size
        • Competitors have equal market share
        • Industry growth is slow
        • Fixed costs are high
        • Products are undifferentiated
        • Brand loyalty is insignificant
        • Consumer switching costs are low
        • Competitors are strategically diverse
        • There is excess production capacity
        • Exit barriers are high

Intensity of Rivalry is Low if

        • Competitors are few
        • Competitors have unequal size
        • Competitors have unequal market share
        • Industry growth is fast
        • Fixed costs are low
        • Products are differentiated
        • Brand loyalty is significant
        • Consumer switching costs are high
        • Competitors are not strategically diverse
        • There is no excess production capacity
        • Exit barriers are low

Competitor Analysis

Competitor intelligence is the data and information that a firm gathers to better understand and anticipate its competitors' objectives, strategies, assumptions and capabilities. When gathering competitive intelligence, firms must pay specific attention to complementors, who add value to the focal firm's products and strategies. E.g. Microsoft and Intel are complementors. Competitor intelligence collection needs to follow ethical practices, such as obtaining and analysing public information, attending trade fairs or obtaining brochures.