Product Development: Design is more important than you think

Programming is just another tool, like a hammer or pliers. The only difference is that much of the work is mental, while most work with a hammer is physical. That doesn’t mean we should be using a U-shaped hammer; that would be really stupid. A hammer is designed around the force that can be produced by swinging it, and that doesn’t require complexity: a simple piece of wood with an iron head works. Programming should be like that too, with a reasonable design.

In my early days, my projects had messy programs, but they nevertheless worked fine: a function here and a function there, with arbitrary code inside them that just worked. At that time, the goal wasn’t to write the best program but to make the project work and to get the feeling that I had built something. As I practiced more and built things meant to last longer, I realised that a program should also have a design. It should be simple and follow a pattern.

Patterns are like musical notes: once you get hold of a few, you can imagine what comes next.

See the picture at the top; that’s what most programs look like. We all write something that ends up looking like that: functions called and written in arbitrary places, data passed around and modified in unknown places, and sometimes legacy code that doesn’t let you write an improved version because it’s just too much work. Hence, design is important. In programming, a reusable design of this kind is called a design pattern.

Design patterns help structure code in such a way that if you know how one piece of the code works, the rest can be figured out. They are inspired by real-life structures and manufacturing systems. Visualising code as a physical system is a great help, as the brain can easily consume the information and draw parallels. Design patterns are available for all languages (google “<language> design pattern”).

The other side of programming is the anti-pattern. An anti-pattern is a solution that seems to work but is counter-productive. Such a solution looks easy to implement, but once you start on it, it becomes highly complex, confusing and probably inefficient. A programmer must study both patterns and anti-patterns; this helps not only in writing better code but also in framing better solutions. The complexity of a solution is often related to the design we have in our head, and structured thoughts lead to a better and quicker implementation.

The other reason our programs should have a good design is the cost of maintenance. It may seem that just writing code that works costs less, but over time it ends up costing more. Once there is bad code in the system, it continuously needs fixes and rework. The cost is not only money but also time, which is far more important. But you should not go to the other extreme and try to write perfect code either, because perfection is nearly impossible; the code will always have shortcomings. Good code is code that anyone can read and that doesn’t require frequent maintenance.

To read more: SourceMaking, or Design Patterns by the Gang of Four, for languages like C++, Java, Python, etc.

Languages: Best way to learn the concepts

If one wants to pioneer software engineering, they must learn languages.

Computer languages are similar to the languages that we speak. People who speak multiple languages are generally good at understanding complex things because of their rich vocabulary. The more languages a person knows, the wider their horizon. Learning languages is widely reported to rewire the brain and improve memory. The same can be said about programming languages. Every language has its own paradigm and concepts that it’s built on. Even if a person doesn’t want to use a language, I believe they should read about it. It’s the best way to understand different concepts, principles and constructs, and to come up with new ones.

Every programming language is built on an idea, and more often than not that idea is unique. Programming languages are usually built on new fundamental ideas about software engineering and solve certain types of problems really well. That’s why functional programming languages differ from object-oriented languages, which in turn differ from procedural languages. One can take principles from these languages and reconstruct them in the language of one’s choice; for example, many mainstream programming languages now support features of functional languages.
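As a small illustration of borrowing a functional idea in a general-purpose language, here is a minimal Python sketch. The sample data and the "double the large orders" rule are made up for the example; the point is only that the same computation can be written as a mutating loop or as a composition of filter, map and reduce.

```python
from functools import reduce

order_amounts = [120, 45, 300, 80]  # hypothetical sample data

# Procedural style: an explicit loop that mutates an accumulator.
total_loop = 0
for amount in order_amounts:
    if amount >= 100:
        total_loop += amount * 2

# Functional style: compose filter, map and reduce without mutation.
large_orders = filter(lambda amount: amount >= 100, order_amounts)
doubled = map(lambda amount: amount * 2, large_orders)
total_functional = reduce(lambda acc, amount: acc + amount, doubled, 0)
```

Both versions arrive at the same total; which one reads better is a matter of taste and of the team’s habits.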

After learning a few languages, I have stopped thinking in terms of languages. I use my understanding of languages to map out solutions to a problem. I then try to find the constructs that can help me implement it in the easiest way. The easiest way may not be the best way, but that is a trade-off one has to make while building a product.

More often than not, one doesn’t need to build the best product but only the product that works.

Once a person gets hold of a few languages, it also becomes easy for them to learn new ones, because the major part of any language repeats what they already know. It’s only the new concepts that take time, and usually there are not many of them. Often the learning curve is not steep if the language is of the same paradigm that you are used to. Usually a person doesn’t need to work in more than 2-3 languages at a time, and that is enough to implement foreign concepts from other languages. It takes time to reach that stage, but it is achievable. Hope to see you there!

 

Product Management: It’s all about communication

giphy

For the larger part of my career as an engineer I have been an individual contributor. Then, about 18 months ago, straight out of engineering, I got a team of my own. Looking back now, this GIF explains exactly how I feel about that period. Not that I was dumb, but I was so involved in what I was doing that I usually forgot to look at the zoomed-out picture. But that’s not all; there is another side to this story as well.

In such a short stint of working with the team, I have learned many things about how things are in my head and how they actually are. The most important thing I learnt is that:

Developers can only be as good as their managers are.

What I mean is that a developer cannot build things better than what is communicated to them. The kid in the GIF did exactly what was communicated to him, and that’s what developers do as well. The worst part is that the coach is the one who is wrong, and in my case I was the coach. If I do a shoddy job of communicating the problem, the work delivered will be equally shoddy, through no fault of the programmer. It simply is not the programmer’s job to interpret what is being communicated; everything that has to be said has to be said in absolute terms.

It is (relatively) easy to keep everyone on the same page when a team is small. As the number of people grows, communication becomes jittery and information gets lost. This holds for all kinds of communication, be it digital or person-to-person. I have an equation that tells how well informed a person is when they are n communication points away from the origin of the communication:

Effectiveness ∝ (language skills of speaker x listener’s grasping skill) ^ n

Here, both the speaker’s language skills and the listener’s grasping skill are in the range 0 to 1, which means the effectiveness of the communication is always lower than the understanding of the person who is trying to communicate. Hence, the person who understands best is the one initiating the communication, and the first loss of information also happens at this point. The human brain stores information in a complex form. A narration in the head can comprise words, pictures, emotions/feelings and moments. Hence, when a person tries to translate that information into words, some of it is lost.
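To get a feel for how quickly the relation above decays, here is a minimal Python sketch. The function name and the sample skill values are made up for illustration, and since the equation is only a proportionality, the numbers are relative rather than absolute.

```python
def communication_effectiveness(speaker_skill, listener_skill, hops):
    """Relative effectiveness after `hops` hand-offs: (speaker * listener) ** hops."""
    for skill in (speaker_skill, listener_skill):
        if not 0.0 <= skill <= 1.0:
            raise ValueError("skills must be in the range 0 to 1")
    if hops < 0:
        raise ValueError("hops must not be negative")
    return (speaker_skill * listener_skill) ** hops


# Assumed example: everyone in the chain speaks and listens at 0.9.
for hops in (1, 2, 3):
    print(hops, round(communication_effectiveness(0.9, 0.9, hops), 4))
```

With both skills at 0.9, the value drops to roughly 0.81 after one hop, 0.66 after two and 0.53 after three, which is why a message that passes through several people arrives so diluted.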

Coming back to the point: if a developer in the team is producing things that are not what is expected, it is only because things have not been communicated well to them. To improve the effectiveness of the communication, it has to be done over and over. The more times it is repeated, the better the understanding in the listener’s head. Why? Because every time the communication happens, our brain automatically improves on what is to be communicated and adds more to it: the tiny details that we otherwise forget or neglect. A good way to experiment with this idea is to play Pictionary.

This is important for both team members and the team lead. Effort has to be made from both directions. I hope this post helps you in being a better team player! 🙂

Cassandra: What it is and what not

Recently I had a chance to work on Cassandra. In short, the requirement was a distributed key-value store. Redis is great, but it doesn’t let you have multiple geographically distributed writable servers, while Cassandra does. Here are a few points about Cassandra to keep in the back of your head while setting it up.

  • Contrary to what I said, Cassandra is not exactly a key-value store. It is more of a JSON-like storage which can behave like key-value pairs.
  • CQL (Cassandra Query Language) distracts the user from the true nature of Cassandra. It makes one believe that Cassandra is like an RDBMS, while it’s not. One must think of it as key-value, where values can be extended further.
  • Data structures provided by Redis can be implemented easily.
  • Throughput depends on the configuration of the machine. The more CPUs, the better the throughput can be. Go through the other settings as well; best practices are explained in the configuration file.
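  • Data is first written to memory (memtables) and later flushed to disk (SSTables), which are then merged during compaction; when this happens depends on the settings. Note that the 10-day default applies to gc_grace_seconds, the period for which tombstones are kept before compaction can purge them, while memtable flushes are driven by memory limits tied to the Java heap size.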
  • The fewer the tombstones, the faster the compaction. This also lowers the probability of reading SSTables, which is a disk read.
  • Updates are a waste of resources; use inserts instead. An insert overwrites the existing row/document on compaction.
  • To keep reads fast, use a consistency level of ONE (a quorum of 1), that is, whatever you get on selection is the truth.
  • The partitioning key should be designed to keep the data related to it on a single node. IMO the partitioning key is analogous to a table in an RDBMS. This way, a read for a particular key with quorum equal to 1 will always be the truth. See composite partitioning keys.
  • Latency is a thing to worry about. Try to keep the number of reads as low as possible.
  • Schema structure is very important. Learn which queries you need to make on the table before writing the schema. Unlike in an RDBMS, conditions have to be on successive key columns instead of arbitrary columns.
  • If using PHP, use the Java client with a PHP-Java bridge instead of the native PDO driver. It provides almost 3x the read/write throughput per node.
  • IRC is a good place to get help in case of issues.
  • If the nodes are EC2 instances, a snitch configured for EC2 is available for use.
  • Version 2.0.9 is not compatible with 2.1.x in a cluster.
  • In general, versions below x.x.5 are not production ready and have serious bugs. The current 2.1.1 is not suitable for a production environment.
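
The point about partitioning keys can be pictured with a toy model in Python. This is only a sketch under stated assumptions: the node names and rows are made up, and a SHA-256 hash taken modulo the node count stands in for Cassandra’s real partitioner and token ring, which also handle replication. What it shows is that every row sharing a partition key lands on the same node, so a single-node read can see all of them.

```python
import hashlib
from collections import defaultdict

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster


def node_for(partition_key):
    """Pick a node from a hash of the partition key (toy stand-in for a token ring)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


def place_rows(rows):
    """Group (partition_key, clustering_key, value) rows by the node that owns them."""
    placement = defaultdict(list)
    for partition_key, clustering_key, value in rows:
        placement[node_for(partition_key)].append((partition_key, clustering_key, value))
    return placement


rows = [
    ("user:42", "2014-11-01", "login"),
    ("user:42", "2014-11-02", "purchase"),
    ("user:7", "2014-11-01", "login"),
]
placement = place_rows(rows)
```

In this model the clustering key only orders rows inside a partition; it plays no part in choosing the node.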

Natural Language Processing – Terminology

Artificial Intelligence has lots of applications, one of which is Natural Language Processing, or NLP for short. NLP is the study or processing of natural texts to find interesting patterns or details in them for other purposes, like tagging a post automatically. Contrary to the general belief that it is very advanced computer science, NLP started in the early 1950s.

There are various applications of NLP, like machine translation (eg: Google Translate), part-of-speech tagging (eg: tagging whether a word is a noun, verb, etc.), automatic summarization (eg: Text Compactor) and natural language generation (eg: Alice, software that interacts with humans). You can read more about it here.

In this post, I explain the terminology used in NLP.

  • Natural Text
    Natural text is any text which is human readable in general. All languages that humans write are natural languages and can be used for language processing.
    Eg: English, Hindi, French
  • Labeled Text
    Text that has been marked up (annotated), which machine learning or NLP algorithms can then use to label or predict annotations for un-annotated, or simply unmarked, text.
    Eg: A movie review tagged with positive or negative sentiment.
  • Data Set
    The data on which we intend to perform the NLP task is the data set.
    Eg: Data set of human genome.
  • Attributes
    A data set generally has certain attributes which we use during processing. A column in a database table can be considered an attribute.
    Eg: “Wind speed” in weather data.
  • Classification
    When there is a list of pre-defined labels, the problem of assigning those labels to the data set is called classification.
    Eg: Classifying news articles into their categories, classifying a tumor as benign or malignant.
  • Regression
    When there is a data set and we have to use it to predict one attribute, typically a numeric value, from the other attributes, it is a regression problem. It is not generally used in NLP.
    Eg: Prediction of weather, prediction of house rent based on house area.
  • Clustering
    When there are no pre-defined labels and all we know is that we have to find similar types of objects, the problem we are dealing with is clustering.
    Eg: Clustering similar kinds of reviews or posts.
  • Supervised Learning
    A learning algorithm which learns from previously labeled data; strictly, it trains a model, which we will talk about in the next post. Learning algorithms can be of various types: statistics based, probability based, simple if-else rules or something else. Some of the algorithms are Naive Bayes, Decision Trees, K Nearest Neighbors, Support Vector Machines, etc. Supervised learning is generally used when we have a huge or considerable amount of labeled data.
    Eg: Part of Speech Tagging, Severity Prediction for the Bug Data.
  • Unsupervised Learning
    Algorithms which train the model when no labels are available. These algorithms are generally built on an “intuition” or heuristic, which is nothing but an observation about the data. Some of the algorithms are K-Means, Single Linkage and certain Artificial Neural Networks (such as self-organizing maps). The general field where it is used is clustering.
    Eg: Clustering of news articles by their categories, Clustering of users by their data.
  • Semi-Supervised Learning
    This combines supervised and unsupervised learning. It is generally used on data of which a small part is labeled and most is unlabeled. It learns both from the labeled data and from data that gets labeled after unsupervised learning.
  • Pre-processing
    The task of processing the data set prior to using it for learning the language models. In this part of our program we modify the data set to convert it into a corpus, that is, data which we can feed to the algorithms. The algorithms are generic, meaning they work for various kinds of data sets, hence the need to convert our data so that we can use them. Below are the methods generally used in NLP pre-processing.
  • N-Gram
    A gram is a unit. In NLP, a gram can be either a word or a character.
    A character N-Gram means considering ‘n’ consecutive characters at once, and a word N-Gram means the same for words. The most common cases are the Uni-Gram, a single unit, and the Bi-Gram, two units.
    Eg: I live in New York.
    Uni-Gram => [‘I’, ‘live’, ‘in’, ‘New’, ‘York’]
    Bi-Gram => [‘I live’, ‘live in’, ‘in New’, ‘New York’]
  • Tokenization
    We break the text into tokens, which are generally words or phrases of a certain length. In simple words, we extract the words, or all possible phrases of a certain length, from the sentence or text.
  • Stop Words Removal
    Not all words in a language carry actual meaning; some can be removed without losing the true meaning, or at least the gist, of the text.
    Eg: Conjunctions, prepositions, pronouns. ‘they’, ‘then’, ‘under’, ‘over’, ‘is’, ‘an’, ‘and’ etc. You can find the list on MySQL website.
    Note: List of stop words depends on the context of the problem.
  • Stemming
    Reducing a word to its root word is called stemming. It has the same meaning as the root words we study in grammar.
    Eg: ‘education’, ‘educate’, ‘educating’ have the root word ‘educate’ (a stemming algorithm may output a truncated form such as ‘educ’).
  • Features
    Tokens which have gone through all the required pre-processing and are ready to be fed to our algorithm are called features. Features are the units which we have extracted from the data set and supply to our algorithm so that it can learn. The difference between tokens and features is simply that tokens, once ready for the algorithm, are called features.
    Eg: Set of words extracted from data set which we can pass to classifier.
  • Features Extraction
    Extraction of features from the tokens. There are various methods to do so, which we will read about later.
  • Features Selection
    Selection of features from the extracted tokens. Not all tokens are useful, and not all kinds of data are useful. We have to decide which ones will be good to work with.
  • Training and Testing
    There are two parts. The first is training, where we feed the features to the algorithm, which trains, or let’s say creates, a model for those features. The second is testing, where we use the model the algorithm has created to test how well it is doing.
  • Evaluation
    There are various methods to evaluate a model. Some common evaluation measures are Accuracy, Precision, Recall and F1-Score.
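
The pre-processing terms above (tokenization, N-Grams, stop words removal) can be sketched in a few lines of plain Python. This is an illustrative sketch only: the regular expression, the function names and the tiny stop-word list are assumptions made for the example, and real projects usually rely on a library and a much longer, context-dependent stop-word list.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"i", "in", "is", "an", "and", "the", "they", "then"}


def tokenize(text):
    """Split text into word tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z0-9']+", text)


def word_ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(word, n):
    """Return every run of n consecutive characters in a word."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list (case-insensitive)."""
    return [token for token in tokens if token.lower() not in STOP_WORDS]


tokens = tokenize("I live in New York.")
unigrams = word_ngrams(tokens, 1)
bigrams = word_ngrams(tokens, 2)
content_words = remove_stop_words(tokens)
```

For the example sentence, the uni-grams and bi-grams match the lists given under N-Gram, and stop words removal leaves ‘live’, ‘New’ and ‘York’.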

Note: Not all of the above may actually be required in every NLP problem, but these are some common terms which every NLP enthusiast should be familiar with. There is also a chance of inaccuracy in the definitions, as I tried to explain things in the easiest way possible.
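
For the evaluation measures named in the list, a minimal Python sketch for a two-class problem may help. The function name and the sample labels are made up for illustration; it uses the standard definitions of accuracy, precision, recall and F1 with respect to one chosen positive label.

```python
def evaluate(actual, predicted, positive):
    """Compute accuracy, precision, recall and F1 for one positive label."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equally long")
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == positive and p == positive)
    false_pos = sum(1 for a, p in pairs if a != positive and p == positive)
    false_neg = sum(1 for a, p in pairs if a == positive and p != positive)
    correct = sum(1 for a, p in pairs if a == p)

    accuracy = correct / len(pairs)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical sentiment labels for five reviews.
actual = ["pos", "pos", "neg", "neg", "pos"]
predicted = ["pos", "neg", "neg", "pos", "pos"]
scores = evaluate(actual, predicted, positive="pos")
```

Here three of five predictions are correct (accuracy 0.6), and with two true positives, one false positive and one false negative, precision, recall and F1 all come out at two thirds.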

Story of Memcached

One MemcacheClass to rule them all, One MemcacheClass to find them,
One MemcacheClass to bring them all and in the RAMs bind them.

Below is a story I found about memcached. It is an amazing story of how memcached works.

Two plucky adventurers, Programmer and Sysadmin, set out on a journey. Together they make websites. Websites with webservers and databases. Users from all over the Internet talk to the webservers and ask them to make pages for them. The webservers ask the databases for junk they need to make the pages. Programmer codes, Sysadmin adds webservers and database servers.

One day the Sysadmin realizes that their database is sick! It’s spewing bile and red stuff all over! Sysadmin declares it has a fever, a load average of 20! Programmer asks Sysadmin, “well, what can we do?” Sysadmin says, “I heard about this great thing called memcached. It really helped livejournal!” “Okay, let’s try it!” says the Programmer.

Our plucky Sysadmin eyes his webservers, of which he has six. He decides to use three of them to run the ‘memcached’ server. Sysadmin adds a gigabyte of ram to each webserver, and starts up memcached with a limit of 1 gigabyte each. So he has three memcached instances, each can hold up to 1 gigabyte of data. So the Programmer and the Sysadmin step back and behold their glorious memcached!

“So now what?” they say, “it’s not DOING anything!” The memcacheds aren’t talking to anything and they certainly don’t have any data. And NOW their database has a load of 25!

Our adventurous Programmer grabs the pecl/memcache client library manual, which the plucky Sysadmin has helpfully installed on all SIX webservers. “Never fear!” he says. “I’ve got an idea!” He takes the IP addresses and port numbers of the THREE memcacheds and adds them to an array in php.

$MEMCACHE_SERVERS = array(
    "10.1.1.1", //web1
    "10.1.1.2", //web2
    "10.1.1.3", //web3
);

Then he makes an object, which he cleverly calls ‘$memcache’.

$memcache = new Memcache();
foreach($MEMCACHE_SERVERS as $server){
    $memcache->addServer ( $server );
}

Now Programmer thinks. He thinks and thinks and thinks. “I know!” he says. “There’s this thing on the front page that runs SELECT * FROM hugetable WHERE timestamp > lastweek ORDER BY timestamp ASC LIMIT 50000; and it takes five seconds!” “Let’s put it in memcached,” he says. So he wraps his code for the SELECT and uses his $memcache object. His code asks:

Are the results of this select in memcache? If not, run the query, take the results, and PUT it in memcache! Like so:

$huge_data_for_front_page = $memcache->get("huge_data_for_front_page");
if($huge_data_for_front_page === false){
    // cache miss: run the slow query and collect the rows
    $huge_data_for_front_page = array();
    $sql = "SELECT * FROM hugetable WHERE timestamp > lastweek ORDER BY timestamp ASC LIMIT 50000";
    $res = mysql_query($sql, $mysql_connection);
    while($rec = mysql_fetch_assoc($res)){
        $huge_data_for_front_page[] = $rec;
    }
    // cache for 10 minutes
    $memcache->set("huge_data_for_front_page", $huge_data_for_front_page, 0, 600);
}

// use $huge_data_for_front_page how you please

Programmer pushes code. Sysadmin sweats. BAM! DB load is down to 10! The website is pretty fast now. So now, the Sysadmin puzzles, “What the HELL just happened!?” “I put graphs on my memcacheds! I used cacti, and this is what I see! I see traffic to one memcached, but I made three :(.” So, the Sysadmin quickly learns the ascii protocol and telnets to port 11211 on each memcached and asks it:

Hey, ‘get huge_data_for_front_page’ are you there?

The first memcached does not answer…

The second memcached does not answer…

The third memcached, however, spits back a huge glob of crap into his telnet session! There’s the data! Only one memcached has the key that the Programmer cached!

Puzzled, he asks on the mailing list. They all respond in unison, “It’s a distributed cache! That’s what it does!” But what does that mean? Still confused, and a little scared for his life, the Sysadmin asks the Programmer to cache a few more things. “Let’s see what happens. We’re curious folk. We can figure this one out,” says the Sysadmin.

“Well, there is another query that is not slow, but is run 100 times per second. Maybe that would help,” says the Programmer. So he wraps that up like he did before. Sure enough, the server load drops to 8!

So the Programmer codes more and more things get cached. He uses new techniques. “I found them on the list and the faq! What nice blokes,” he says. The DB load drops; 7, 5, 3, 2, 1!

“Okay,” says the Sysadmin, “let’s try again.” Now he looks at the graphs. ALL of the memcacheds are running! All of them are getting requests! This is great! They’re all used!

So again, he takes keys that the Programmer uses and looks for them on his memcached servers. ‘get this_key’ ‘get that_key’ But each time he does this, he only finds each key on one memcached! Now WHY would you do this, he thinks? And he puzzles all night. That’s silly! Don’t you want the keys to be on all memcacheds?

“But wait”, he thinks “I gave each memcached 1 gigabyte of memory, and that means, in total, I can cache three gigabytes of my database, instead of just ONE! Oh man, this is great,” he thinks. “This’ll save me a ton of cash. Brad Fitzpatrick, I love your ass!”

“But hmm, the next problem, and this one’s a puzzler, this webserver right here, this one running memcached, it’s old, it’s sick and needs to be upgraded. But in order to do that I have to take it offline! What will happen to my poor memcache cluster? Eh, let’s find out,” he says, and he shuts down the box. Now he looks at his graphs. “Oh noes, the DB load, it’s gone up in stride! The load isn’t one, it’s now two. Hmm, but still tolerable. All of the other memcacheds are still getting traffic. This ain’t so bad. Just a few cache misses, and I’m almost done with my work.” So he turns the machine back on, and puts memcached back to work. After a few minutes, the DB load drops again back down to 1, where it should always be.

“The cache restored itself! I get it now. If it’s not available it just means a few of my requests get missed. But it’s not enough to kill me. That’s pretty sweet.”

So, the Programmer and Sysadmin continue to build websites. They continue to cache. When they have questions, they ask the mailing list or read the faq again. They watch their graphs. And all live happily ever after.

Author: Dormando via IRC. Edited by Brian Moon for fun. Further fun editing by Emufarmers.

This story has been illustrated by the online comic TOBlender.com.

Chinese translation by Wei Liu.

Source: http://code.google.com/p/memcached/wiki/TutorialCachingStory
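
The two ideas in the story, that each key lives on exactly one memcached and that the application checks the cache before running the query, can be sketched with a toy model in Python. Everything here is an assumption made for illustration: the class and function names are made up, in-memory dictionaries stand in for the memcached servers, and a CRC32 hash modulo the server count stands in for whatever hashing a real client library uses. Expiry is not modelled.

```python
import zlib


class ToyCacheCluster:
    """In-memory stand-in for a set of memcached servers."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.stores = {server: {} for server in self.servers}

    def server_for(self, key):
        """Hash the key to pick exactly one server (assumed scheme)."""
        index = zlib.crc32(key.encode("utf-8")) % len(self.servers)
        return self.servers[index]

    def get(self, key):
        """Return the cached value, or None on a miss."""
        return self.stores[self.server_for(key)].get(key)

    def set(self, key, value):
        self.stores[self.server_for(key)][key] = value


def get_or_compute(cache, key, compute):
    """Return the cached value; on a miss, compute it and store it."""
    value = cache.get(key)
    if value is None:  # None marks a miss, so None itself cannot be cached
        value = compute()
        cache.set(key, value)
    return value


cluster = ToyCacheCluster(["10.1.1.1", "10.1.1.2", "10.1.1.3"])
front_page = get_or_compute(cluster, "huge_data_for_front_page", lambda: ["row1", "row2"])
```

Because the server is chosen from the key alone, the three 1 gigabyte caches add up to three gigabytes of distinct data, and losing one server only turns the keys it held into cache misses.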

Algorithm – N Kings

N Kings

You have to place N kings on an N×N chess board so that no two kings are in the same row or column and no two kings attack each other.

About input: the first line of the input is the number of test cases T. Each test case then takes the next 2 lines. The first line has N, the size of the chess board, and K, the number of kings already in place, starting from the first row. The next line has K numbers. The number (pos) in position i is the column in which the king in row i is already placed. 0 <= K <= N. 0 <= pos[i] <= N-1

About output: the output is the number of possible ways the kings can be placed, modulo 1000000007.

Input:
2
3 0

4 1
2

Output:
0
1

 

Algorithm

I solved the problem recursively by setting up the board and, for each row, checking every column to see whether the position is valid. You can see the solution as a Depth First Search, as we first traverse all the way down to a leaf (a solution). The algorithm uses Constraint Programming by default, as the first row after the already placed pieces is always constrained by them, that is, it has fewer possible positions to start with.
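
The recursive approach described above could look roughly like the following. This is a minimal sketch written for this explanation, not the linked solution: the function name is made up, and it relies on the observation that with one king per row and per column, two kings can only attack each other when they sit in adjacent rows and adjacent columns. It is a plain brute-force search, so it is only practical for small N.

```python
MOD = 1_000_000_007


def count_king_placements(n, placed):
    """Count the ways to complete the board, modulo MOD.

    placed[i] is the column of the king already fixed in row i.
    """
    used = [False] * n
    # Check that the pre-placed kings are themselves consistent.
    for row, col in enumerate(placed):
        if used[col] or (row > 0 and abs(col - placed[row - 1]) <= 1):
            return 0
        used[col] = True

    def dfs(row, prev_col):
        if row == n:  # every row filled: one complete solution (a leaf)
            return 1
        total = 0
        for col in range(n):
            if used[col]:
                continue  # column already taken
            if prev_col is not None and abs(col - prev_col) <= 1:
                continue  # would touch the king in the previous row
            used[col] = True
            total += dfs(row + 1, col)
            used[col] = False
        return total % MOD

    prev_col = placed[-1] if placed else None
    return dfs(len(placed), prev_col)
```

For the sample input, `count_king_placements(3, [])` gives 0 and `count_king_placements(4, [2])` gives 1, matching the expected output.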

 

Code

You can see the code here. It’s in Python (2.6.2).