software engineers - ReadWriteWeb
Copyright 2009 Richard MacManus
Sun, 22 Nov 2009 19:36:29 -0800

12 Unit Testing Tips for Software Engineers

Unit testing is one of the pillars of Agile software development. Popularized by Kent Beck, unit testing has found its way into the hearts and systems of many organizations. Unit tests help engineers reduce the number of bugs and the hours spent on debugging, and they contribute to healthier, more stable software.

In this post we look at a dozen unit testing tips that software engineers can apply, regardless of their programming language or environment.

1. Unit Test to Manage Your Risk

A newbie might ask, "Why should I write tests?" Indeed, aren't tests boring stuff that software engineers want to outsource to those QA guys? That mentality no longer has a place in modern software engineering. The goal of software teams is to produce software of the highest quality. Consumers and business users were rightly intolerant of the buggy software of the 80s and 90s. But with the abundance of libraries, web services and integrated development environments that support refactoring and unit testing, there's now no excuse for software with bugs.

The idea behind unit testing is to create a set of tests for each software component. Unit tests facilitate continuous software testing; unlike manual tests, it's cheap to perform them repeatedly.

As your system expands, so does the body of unit tests. Each test is insurance that the system works. Having a bug in the code means carrying a risk. With a set of unit tests, engineers can dramatically reduce the number of bugs and the risk that comes with untested code.

2. Write a Test Case Per Major Component

When you start unit testing, always ask, "What tests should I write?"

The initial impulse is to write a bunch of functional tests; i.e., tests that probe different functions of the system. This is not correct. The right thing is to create a test case (a set of tests) for each major component.

The focus of the test is one component at a time. Within each component, look for an interface: the set of publicly exposed behaviours that the component offers. You should then write at least one test per public method.
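As a rough illustration of one test case per component, here is a minimal sketch using Python's unittest module, an xUnit relative of JUnit. The ShoppingCart component and its methods are hypothetical, invented only for this example.

```python
import unittest


class ShoppingCart:
    """Hypothetical component with a small public interface."""

    def __init__(self):
        self._items = {}

    def add(self, name, price):
        self._items[name] = price

    def remove(self, name):
        del self._items[name]

    def total(self):
        return sum(self._items.values())


class ShoppingCartTest(unittest.TestCase):
    """One test case for the component, one test per public method."""

    def setUp(self):
        self.cart = ShoppingCart()

    def test_add(self):
        self.cart.add("book", 10)
        self.assertEqual(self.cart.total(), 10)

    def test_remove(self):
        self.cart.add("book", 10)
        self.cart.remove("book")
        self.assertEqual(self.cart.total(), 0)

    def test_total(self):
        self.cart.add("book", 10)
        self.cart.add("pen", 2)
        self.assertEqual(self.cart.total(), 12)
```

Saved in a file, the tests can be run with python -m unittest.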

3. Create Abstract Test Case and Test Utilities

As with any code, there will be common things all your tests need to do. Start by finding a unit testing framework for your language. For example, in Java, engineers use JUnit - a simple yet powerful framework for writing tests in Java. The framework comes with the TestCase class, the base class for all tests. Add convenience methods and utilities applicable to your environment. This way, all your test cases can share this common infrastructure.
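A shared base class might look roughly like the sketch below. It uses Python's unittest rather than JUnit, and BaseTestCase, write_file and assert_file_contains are hypothetical names chosen for illustration.

```python
import os
import shutil
import tempfile
import unittest


class BaseTestCase(unittest.TestCase):
    """Hypothetical shared base class: every test gets a scratch directory."""

    def setUp(self):
        self.work_dir = tempfile.mkdtemp()
        # addCleanup runs even when the test fails.
        self.addCleanup(shutil.rmtree, self.work_dir, ignore_errors=True)

    def write_file(self, name, text):
        """Utility: create a file in the scratch directory, return its path."""
        path = os.path.join(self.work_dir, name)
        with open(path, "w") as handle:
            handle.write(text)
        return path

    def assert_file_contains(self, path, expected):
        """Utility: custom assertion available to all test cases."""
        with open(path) as handle:
            self.assertIn(expected, handle.read())


class ReportTest(BaseTestCase):
    def test_report_is_written(self):
        path = self.write_file("report.txt", "total: 42")
        self.assert_file_contains(path, "42")
```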

4. Write Smart Tests

Testing is time-consuming, so ensure your tests are effective. Good tests probe the core behaviour of each component, but do it with the least code possible. For example, there is little reason to write tests for Java Bean setter and getter methods, since these will be exercised anyway by other tests.

Instead, write a test that focuses on the behaviour of the system. You don't need to be comprehensive; create the tests that come to mind now, then be ready to come back to add more.

5. Set up Clean Environment for Each Test

Software engineers are always concerned with efficiency, so when they hear that each test needs to be set up separately they worry about performance. Yet setting up each test correctly and from scratch is important. The last thing you want is for the test to fail because it used some old piece of data from another test. Ensure each test is set up properly and don't worry about efficiency.

In cases when you have a common environment for all tests - which doesn't change as tests run - you can add a static set up block to your base test class.
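The difference between per-test set up and a one-time shared environment can be sketched as follows, again with Python's unittest. The InventoryTest class and its data are hypothetical; setUpClass plays the role of the static set up block.

```python
import unittest


class InventoryTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Shared, read-only environment: built once for the whole class.
        cls.price_list = {"book": 10, "pen": 2}

    def setUp(self):
        # Mutable state: rebuilt from scratch before every test.
        self.stock = {"book": 1}

    def test_sell_last_book(self):
        self.stock["book"] -= 1
        self.assertEqual(self.stock["book"], 0)

    def test_stock_is_fresh(self):
        # Passes regardless of test order because setUp reset the stock.
        self.assertEqual(self.stock["book"], 1)
        self.assertEqual(self.price_list["book"], 10)
```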

6. Use Mock Objects To Test Effectively

Setting up tests is not always simple, and at first glance it sometimes seems impossible. For example, if your code uses Amazon Web Services, how can you simulate them in a test without impacting the real system?

There are a couple of ways. You can create dedicated test data in the real system: in a system that has users, for example, a special set of accounts can be used exclusively for testing.

Running tests against a production system is risky, though: what if something goes wrong and you delete actual user data? An alternative is to replace the real service with fake objects, called stubs or mock objects.

A mock object implements a particular interface, but returns predetermined results. For example, you can create a mock object for Amazon S3 which always reads files from your local disk. Mock objects are helpful when testing complex systems with lots of components. In Java, several frameworks help create mock objects, most notably JMock.
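The local-disk stand-in described above could be sketched like this in Python. LocalDiskStore, its read(key) method and count_lines are hypothetical names; the sketch does not use any real AWS API, it only assumes the code under test talks to the store through a single read method.

```python
import os
import tempfile


class LocalDiskStore:
    """Hypothetical stand-in for a remote file store: same read(key)
    interface, but files come from a local directory."""

    def __init__(self, root):
        self.root = root

    def read(self, key):
        with open(os.path.join(self.root, key), "rb") as handle:
            return handle.read()


def count_lines(store, key):
    """Code under test: depends only on the store's read() method."""
    return len(store.read(key).splitlines())


def demo_count_lines():
    """Prepare a local file, then exercise count_lines against the fake."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "log.txt"), "wb") as handle:
            handle.write(b"first\nsecond\nthird\n")
        return count_lines(LocalDiskStore(root), "log.txt")
```

Because count_lines only sees the read method, the production store and the fake are interchangeable from its point of view.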

7. Refactor Tests When You Refactor the Code

Testing only pays if you really invest in it. Not only do you need to write tests, you also need to ensure they're up to date. When adding a new method to a component, you need to add one or more corresponding tests. Just as you clean out unused code, remove tests that are no longer applicable.

Unit tests are particularly helpful when doing large refactorings. Refactoring focuses on continuous sculpting of the code to help it stay correct. After you move code around and fix the tests, rerunning all the related tests ensures you didn't break anything while changing the system.

8. Write Tests Before Fixing a Bug

Unit tests are effective weapons in the fight against bugs. When you uncover a problem in your code, write a test that exposes this problem before fixing the code. This way, if the problem reappears, it will be caught by the test.

It is important to do this since you can't always write comprehensive tests right away. When you add a test for a bug, you're filling in the gap in your original tests in a disciplined way.
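A small sketch of the workflow, in Python: the average function and its empty-list bug are hypothetical. The first test was written to reproduce the bug and failed until the guard was added.

```python
def average(values):
    """Fixed version. The hypothetical original divided by len(values)
    unconditionally and raised ZeroDivisionError on an empty list."""
    if not values:
        return 0.0
    return sum(values) / len(values)


def test_average_of_empty_list_is_zero():
    # Written first, to reproduce the reported bug; it failed before the fix.
    assert average([]) == 0.0


def test_average_of_values():
    # Existing behaviour must keep working after the fix.
    assert average([2, 4, 6]) == 4.0
```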

9. Use Unit Tests to Ensure Performance

In addition to guarding the correctness of the code, unit tests can help ensure the performance of your code doesn't degrade over time. In many systems slowness creeps in as the system grows.

To write performance tests, you need to implement start and stop functions in your base test class. When appropriate, you can time a particular method or piece of code and assert that the elapsed time is within the limits of the desired performance.
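One possible shape for such timing helpers, sketched with Python's unittest. TimedTestCase and its method names are hypothetical, and the one-second budget is an arbitrary, generous limit chosen for illustration.

```python
import time
import unittest


class TimedTestCase(unittest.TestCase):
    """Hypothetical base class with start/stop timing helpers."""

    def start_timer(self):
        self._started = time.perf_counter()

    def stop_timer(self):
        return time.perf_counter() - self._started

    def assert_faster_than(self, seconds, func, *args):
        """Time one call to func and fail if it exceeds the budget."""
        self.start_timer()
        func(*args)
        elapsed = self.stop_timer()
        self.assertLess(elapsed, seconds)


class SortPerformanceTest(TimedTestCase):
    def test_sort_is_fast_enough(self):
        data = list(range(10000, 0, -1))
        self.assert_faster_than(1.0, sorted, data)
```

Wall-clock limits depend on the machine running the tests, so budgets like this are usually kept loose.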

10. Create Tests for Concurrent Code

Concurrent code is notoriously tricky and typically a source of many bugs. This is why it's important to unit test concurrent code. The way to do this is by using a system of sleeps and locks. You can insert sleep calls in your tests if you need to wait for a particular system state. While this is not a 100% correct solution, in many cases it's sufficient. To simulate concurrency in a more sophisticated scenario, you need to pass locks around to the objects you're testing. In doing so, you will be able to simulate a concurrent system, but sequentially.
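Passing locks into the object under test might look like the following Python sketch. Worker, gate and arrived are hypothetical names. The test holds the lock, waits until the worker signals that it has reached the gate, and only then lets it proceed, so the interleaving is forced into a known order.

```python
import threading


class Worker:
    """Hypothetical object under test. The gate and the arrived event are
    passed in so a test controls exactly when the worker may proceed."""

    def __init__(self, gate, arrived, log):
        self.gate = gate
        self.arrived = arrived
        self.log = log

    def run(self):
        self.log.append("waiting")
        self.arrived.set()           # tell the test we reached the gate
        with self.gate:              # blocks until the test releases it
            self.log.append("working")


def run_scenario():
    gate = threading.Lock()
    arrived = threading.Event()
    log = []
    gate.acquire()                   # the test holds the lock first
    thread = threading.Thread(target=Worker(gate, arrived, log).run)
    thread.start()
    arrived.wait(timeout=5)          # no sleep needed: wait for the signal
    log.append("test releases gate")
    gate.release()
    thread.join(timeout=5)
    return log
```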

11. Run Tests Continuously

The whole point of tests is to run them a lot. Particularly in larger teams where dozens of developers are working on a common code base, continuous unit testing is important. You can set up tests to run every few hours or you can run them on each check-in of the code or just once a day (typically overnight). Decide which method is the most appropriate for your project and make the tests run automatically and continuously.

12. Have Fun Testing!

Probably the most important tip is to have fun. When I first encountered unit testing, I was sceptical and thought it was just extra work. But I gave it a chance, because smart people who I trusted told me that it's very useful.

Unit testing puts your brain into a state which is very different from the coding state. It is challenging to think about what a simple and correct set of tests for a given component looks like.

Once you start writing tests, you'll wonder how you ever got by without them. To make tests even more fun, you can incorporate pair programming. Whether you get together with fellow engineers to write tests or write tests for each other's code, fun is guaranteed. At the end of the day, you will be comfortable knowing your system really works because your tests pass.

And now please join the conversation! Share unit testing lessons from your projects with all of us.

http://www.readwriteweb.com/archives/12_unit_testing_tips_for_software_engineers.php Analysis Thu, 14 Aug 2008 00:45:12 -0800 Alex Iskold

Top 10 Concepts That Every Software Engineer Should Know

The future of software development is about good craftsmen. With infrastructure like Amazon Web Services and an abundance of basic libraries, it no longer takes a village to build a good piece of software.

These days, a couple of engineers who know what they are doing can deliver complete systems. In this post, we discuss the top 10 concepts software engineers should know to achieve that.

A successful software engineer knows and uses design patterns, actively refactors code, writes unit tests and religiously seeks simplicity. Beyond the basic methods, there are concepts that good software engineers know about. These transcend programming languages and projects - they are not design patterns, but rather broad areas that you need to be familiar with. The top 10 concepts are:

  1. Interfaces
  2. Conventions and Templates
  3. Layering
  4. Algorithmic Complexity
  5. Hashing
  6. Caching
  7. Concurrency
  8. Cloud Computing
  9. Security
  10. Relational Databases

10. Relational Databases

Relational databases have recently been getting a bad name because they cannot scale well to support massive web services. Yet they are one of the most fundamental achievements in computing, one that has carried us for two decades and will remain for a long time. Relational databases are excellent for order management systems, corporate databases and P&L data.

At the core of the relational database is the concept of representing information in records. Each record is added to a table, which defines the type of information. The database offers a way to search the records using a query language, nowadays SQL. It also offers a way to correlate (join) information from multiple tables.

The technique of data normalization is about partitioning the data among tables correctly, so as to minimize data redundancy while keeping retrieval efficient.
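Records, tables, queries and correlation across tables can be seen together in a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the customers and orders tables and their contents are hypothetical.

```python
import sqlite3


def orders_per_customer():
    """Build two normalized tables and correlate them with a JOIN."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id),
            amount INTEGER
        );
        INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO orders VALUES (1, 1, 30), (2, 1, 20), (3, 2, 5);
    """)
    rows = db.execute("""
        SELECT c.name, SUM(o.amount)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
    """).fetchall()
    db.close()
    return rows
```

Each customer name is stored once, in the customers table; orders refer to it by id, which is the kind of redundancy normalization removes.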

9. Security

With the rise of hacking and data sensitivity, security is paramount. Security is a broad topic that includes authentication, authorization, and information transmission.

Authentication is about verifying user identity. A typical website prompts for a password. Authentication typically happens over SSL (Secure Sockets Layer), a way to encrypt information transmitted over HTTP. Authorization is about permissions and is important in corporate systems, particularly those that define workflows. The recently developed OAuth protocol lets users grant web services access to their private information. This is how Flickr permits access to individual photos or data sets.

Another security area is network protection. This concerns operating systems, configuration and monitoring to thwart hackers. Not only is the network vulnerable; any piece of software is. The Firefox browser, marketed as the most secure, has to patch its code continuously. Writing secure code for your system requires understanding its specifics and potential problems.

8. Cloud Computing

In our recent post Reaching For The Sky Through Compute Clouds we talked about how commodity cloud computing is changing the way we deliver large-scale web applications. Massively parallel, cheap cloud computing reduces both costs and time to market.

Cloud computing grew out of parallel computing, the idea that many problems can be solved faster by running the computations in parallel.

After parallel algorithms came grid computing, which ran parallel computations on idle desktops. One of the first examples was the SETI@home project out of Berkeley, which used spare CPU cycles to crunch data coming from space. Grid computing is widely adopted by financial companies, which run massive risk calculations. The concept of under-utilized resources, together with the rise of the J2EE platform, gave rise to the precursor of cloud computing: application server virtualization. The idea was to run applications on demand and change what is available depending on the time of day and user activity.

Today's most vivid example of cloud computing is Amazon Web Services, a package available via API. Amazon's offering includes a compute service (EC2), a storage service for storing and serving large media files (S3), a simple database service (SimpleDB), and a queue service (SQS). These first blocks already empower an unprecedented way of doing large-scale computing, and surely the best is yet to come.

7. Concurrency

Concurrency is one topic engineers notoriously get wrong, and understandably so: although the brain does juggle many things at a time, schools emphasize linear thinking. Yet concurrency is important in any modern system.

Concurrency is about parallelism, but inside the application. Most modern languages have a built-in concept of concurrency; in Java, it's implemented using threads.

A classic concurrency example is the producer/consumer, where the producer generates data or tasks and places them where worker threads can pick them up to consume and execute. The complexity in concurrent programming stems from the fact that threads often need to operate on common data. Each thread has its own sequence of execution, but accesses common data. One of the most sophisticated concurrency libraries has been developed by Doug Lea and is now part of core Java.
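A minimal producer/consumer sketch in Python, using the standard queue and threading modules. The run_pipeline function and the squaring task are hypothetical. The queue hands tasks to workers, and a lock guards the results list that all workers share.

```python
import queue
import threading


def run_pipeline(tasks, worker_count=3):
    """The producer puts numbers on a queue; worker threads square them."""
    todo = queue.Queue()
    results = []
    results_lock = threading.Lock()   # protects the shared results list

    def worker():
        while True:
            item = todo.get()
            if item is None:          # sentinel: no more work
                break
            with results_lock:
                results.append(item * item)

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in workers:
        thread.start()
    for task in tasks:                # the producer
        todo.put(task)
    for _ in workers:                 # one sentinel per worker
        todo.put(None)
    for thread in workers:
        thread.join()
    return sorted(results)
```

The results are sorted before returning because the order in which workers finish is not deterministic.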

6. Caching

No modern web system runs without a cache, which is an in-memory store that holds a subset of information typically stored in the database. The need for cache comes from the fact that generating results based on the database is costly. For example, if you have a website that lists books that were popular last week, you'd want to compute this information once and place it into cache. User requests fetch data from the cache instead of hitting the database and regenerating the same information.

Caching comes with a cost. Only some subsets of information can be stored in memory. The most common data pruning strategy is to evict items that are least recently used (LRU). Pruning needs to be efficient so that it does not slow down the application.
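LRU eviction can be sketched in a few lines of Python with collections.OrderedDict. The LRUCache class is a hypothetical illustration, not the implementation of any particular caching product.

```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most `capacity` items, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)          # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # evict least recently used
```

Both operations take constant time, which keeps pruning from slowing down the application.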

A lot of modern web applications, including Facebook, rely on a distributed caching system called Memcached, developed by Brad Fitzpatrick when working on LiveJournal. The idea was to create a caching system that utilises spare memory capacity on the network. Today, there are Memcached libraries for many popular languages, including Java and PHP.

5. Hashing

The idea behind hashing is fast access to data. If the data is stored sequentially, the time to find an item is proportional to the size of the list. In a hash table, by contrast, a hash function calculates a number for each element, and that number is used as an index into the table. Given a good hash function that uniformly spreads data along the table, the look-up time is constant. Perfect hashing is difficult to achieve, so hash table implementations support collision resolution.
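To make the index calculation and collision resolution concrete, here is a minimal Python sketch of a hash table that resolves collisions by chaining. ChainedHashTable is a hypothetical teaching example; real code would use the language's built-in dictionary.

```python
class ChainedHashTable:
    """Hash table where each bucket is a list of (key, value) pairs."""

    def __init__(self, bucket_count=8):
        self._buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        # The hash value, reduced modulo the table size, is the index.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for index, (existing, _) in enumerate(bucket):
            if existing == key:
                bucket[index] = (key, value)   # overwrite existing key
                return
        bucket.append((key, value))            # colliding keys share a bucket

    def get(self, key, default=None):
        for existing, value in self._bucket(key):
            if existing == key:
                return value
        return default
```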

Beyond the basic storage of data, hashes are also important in distributed systems. The so-called uniform hash is used to evenly allocate data among computers in a cloud database. A flavor of this technique is part of Google's indexing service; each URL is hashed to a particular computer. Memcached similarly uses a hash function.
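The simplest way to map a key to a machine is to hash the key and take the result modulo the number of machines, as in the Python sketch below. The server_for function and the server names are hypothetical, and this is not a claim about how Google or Memcached actually assign keys.

```python
import hashlib


def server_for(key, servers):
    """Pick a server by hashing the key; the same key always maps to the
    same server as long as the server list does not change."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

A cryptographic digest is used here only because it is stable across runs, unlike Python's built-in hash for strings. With this simple modulo scheme, changing the number of servers remaps most keys.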

Hash functions can be complex and sophisticated, but modern libraries have good defaults. The important thing is to understand how hashes work and how to tune them for maximum performance benefit.

4. Algorithmic Complexity

There are just a handful of things engineers must know about algorithmic complexity. First is big O notation. If something takes O(n), it's linear in the size of the data. O(n^2) is quadratic. Using this notation, you should know that searching through a list is O(n) and binary search (through a sorted list) is O(log n). Sorting n items takes O(n log n) time.
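The gap between O(n) and O(log n) is easy to see by counting comparisons. The Python sketch below is illustrative only; both functions return the index found together with the number of steps taken.

```python
def linear_search(items, target):
    """O(n): may inspect every element. Returns (index, steps)."""
    steps = 0
    for index, item in enumerate(items):
        steps += 1
        if item == target:
            return index, steps
    return -1, steps


def binary_search(sorted_items, target):
    """O(log n): halves the search range on every step."""
    low, high, steps = 0, len(sorted_items) - 1, 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1, steps
```

Looking for the last element of a sorted list of 1,024 numbers takes 1,024 steps with the linear search and at most 11 with the binary search.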

Your code should (almost) never have multiple nested loops (a loop inside a loop inside a loop). Most of the code written today should use Hashtables, simple lists and singly nested loops.

Due to the abundance of excellent libraries, we are not as focused on efficiency these days. That's fine, as tuning can happen later on, after you get the design right.

Elegant algorithms and performance are things you shouldn't ignore. Writing compact and readable code helps ensure your algorithms are clean and simple.

3. Layering

Layering is probably the simplest way to discuss software architecture. It first got serious attention when John Lakos published his book about large-scale C++ systems. Lakos argued that software consists of layers, and the book introduced the concept of layering. The method is this: for each software component, count the number of other components it relies on. That count is the metric of how complex the component is.

Lakos contended that good software follows the shape of a pyramid; i.e., there's a progressive increase in the cumulative complexity of each component, but not in the immediate complexity. Put differently, a good software system consists of small, reusable building blocks, each carrying its own responsibility. In a good system, no cyclic dependencies between components are present and the whole system is a stack of layers of functionality, forming a pyramid.
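The distinction between immediate and cumulative dependencies, and the check for cycles, can be sketched over a plain dependency graph. This Python example is a simplified illustration and not Lakos's exact metric; the graph format (a dict mapping each component to the components it uses) and the function names are assumptions.

```python
def transitive_dependencies(graph, component):
    """All components reachable from `component` (cumulative dependencies)."""
    seen = set()
    stack = list(graph.get(component, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph.get(current, ()))
    return seen


def has_cycle(graph):
    """True if any component depends, directly or indirectly, on itself."""
    return any(name in transitive_dependencies(graph, name) for name in graph)
```

For a layered graph such as {"ui": ["service"], "service": ["storage"], "storage": []}, the ui component has one immediate dependency and two cumulative ones, and the graph has no cycles.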

Lakos's work was a precursor to many developments in software engineering, most notably Refactoring. The idea behind refactoring is continuously sculpting the software to ensure it is structurally sound and flexible. Another major contribution was by Dr Robert Martin from Object Mentor, who wrote about dependencies and acyclic architectures.

Among tools that help engineers deal with system architecture are Structure 101, developed by Headway Software, and SA4J, developed by my former company, Information Laboratory, and now available from IBM.

2. Conventions and Templates

Naming conventions and basic templates are the most overlooked software patterns, yet probably the most powerful.

Naming conventions enable software automation. For example, the Java Beans framework is based on a simple naming convention for getters and setters. Likewise, canonical URLs in del.icio.us, such as http://del.icio.us/tag/software, take the user to the page that has all items tagged software.

Many social software applications use naming conventions in a similar way. For example, if your user name is johnsmith then likely your avatar is johnsmith.jpg and your rss feed is johnsmith.xml.

Naming conventions are also used in testing. For example, JUnit automatically recognizes all the methods in a class that start with the prefix test.
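How a tool can rely on a naming convention is sketched below in Python. The CalculatorChecks class and the run_by_convention function are hypothetical. The runner knows nothing about the class except that test methods start with the agreed prefix.

```python
class CalculatorChecks:
    """Hypothetical class; only methods named test_* are picked up."""

    def test_addition(self):
        assert 1 + 1 == 2

    def test_subtraction(self):
        assert 3 - 1 == 2

    def helper(self):
        raise RuntimeError("never called: name lacks the test_ prefix")


def run_by_convention(instance):
    """Find and call every method whose name starts with 'test_'."""
    names = sorted(name for name in dir(instance) if name.startswith("test_"))
    for name in names:
        getattr(instance, name)()
    return names
```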

The templates we mean here are not C++ or Java language constructs. We're talking about template files that contain variables and then allow binding of objects, resolution, and rendering the result for the client.

ColdFusion was one of the first to popularize templates for web applications. Java followed with JSPs, and recently Apache developed a handy general-purpose templating engine for Java called Velocity. PHP can be used as its own templating engine because it supports the eval function (be careful with security). For XML programming it is standard to use the XSL language for templates.

From generation of HTML pages to sending standardized support emails, templates are an essential helper in any modern software system.
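A standardized support email is a good small case. The Python sketch below uses string.Template from the standard library; the email wording and the render_support_email function are hypothetical.

```python
from string import Template

# Template text with two variables, $name and $ticket.
SUPPORT_EMAIL = Template(
    "Hello $name,\n"
    "Your ticket #$ticket has been received.\n"
)


def render_support_email(name, ticket):
    """Bind values to the template variables and render the result."""
    return SUPPORT_EMAIL.substitute(name=name, ticket=ticket)
```

The template text can be changed without touching the code that fills it in, which is the main appeal of the approach.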

1. Interfaces

The most important concept in software is the interface. Any good software is a model of a real (or imaginary) system. Understanding how to model the problem in terms of correct and simple interfaces is crucial. Lots of systems suffer from the extremes: clumped, lengthy code with little abstraction, or an overly designed system with unnecessary complexity and unused code.

Among the many books, Agile Programming by Dr Robert Martin stands out because of its focus on modeling correct interfaces.

In modeling, there are ways you can iterate towards the right solution. Firstly, never add methods that might be useful in the future. Be minimalist, get away with as little as possible. Secondly, don't be afraid to recognize today that what you did yesterday wasn't right. Be willing to change things. Thirdly, be patient and enjoy the process. Ultimately you will arrive at a system that feels right. Until then, keep iterating and don't settle.
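A minimalist interface might look like the Python sketch below. Store, MemoryStore and remember_visit are hypothetical names; the interface exposes only the two operations the client needs today.

```python
from abc import ABC, abstractmethod


class Store(ABC):
    """Hypothetical minimal interface: only what callers need today."""

    @abstractmethod
    def read(self, key): ...

    @abstractmethod
    def write(self, key, value): ...


class MemoryStore(Store):
    """One implementation; others can be added without changing clients."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        return self._data.get(key)

    def write(self, key, value):
        self._data[key] = value


def remember_visit(store, user):
    """Client code is written against the interface, not an implementation."""
    count = (store.read(user) or 0) + 1
    store.write(user, count)
    return count
```

If a later iteration shows the interface was wrong, only the two methods and their implementations need to change.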

Conclusion

Modern software engineering is sophisticated and powerful, with decades of experience, millions of lines of supporting code and unprecedented access to cloud computing. Today, just a couple of smart people can create software that previously required the efforts of dozens of people. But a good craftsman still needs to know what tools to use, when and why.

In this post we discussed concepts that are indispensable for software engineers. And now tell us please what you would add to this list. Share with us what concepts you find indispensable in your daily software engineering journeys.

Image credit: cbtplanet.com

http://www.readwriteweb.com/archives/top_10_concepts_that_every_software_engineer_should_know.php Analysis Tue, 22 Jul 2008 20:21:07 -0800 Alex Iskold

The American Dream: 17 Years of Engineering Software

Seventeen years ago, on April 10th 1991, a plane landed at John F. Kennedy airport. That plane had just crossed the Atlantic carrying, amongst others, passengers escaping the crumbling Soviet empire. One of them was me. I walked off that plane with my first-ever taste of Coca-Cola in my mouth, a lame teenage mustache, and not a clue about what to expect.

When my sister emailed me on April 10th 2008 and reminded me of our immigration anniversary, I was suddenly overwhelmed with memories. A lot has happened since then. 17 years is such a long time that it is difficult to fathom. I am left with bits and pieces of memories and the person that I am today. Each memory by itself is rarely strong and profound. A single memory is just a dot in your timeline. But when you pile the memories on top of each other, you get a bigger and better picture. Here's to everyone who made my American Dream come true and all of you who helped me grow as a software engineer.

Lehigh University: The Basics

I went to engineering school: Lehigh University in Bethlehem PA. My credits from Ukraine got me into the sophomore year and I immediately declared a math major. I was always a good student, but I never loved math (can you blame me?). It was always too abstract and too detached from reality. I knew how to manipulate formulas, but I had no idea why I was doing it.

My mom was concerned about my future. She kept telling me that there is no money in Math and that I should learn computers - the thing of the future. Back then I was scared of computers. My only prior encounter with them was back in Ukraine where a computer was a humongous piece of metal. The only program that I'd written in Basic to multiply matrices had an infinite loop in it and an angry professor had to reboot the whole machine to stop it. I got a C in that class.

So when I realized that the path to my happiness was in computers, I was kind of scared. To top it off, the first programming class that I took was Introduction to Computer Engineering, focused on coding in Assembly language. The final project was to write an editor in Assembly 8086; and for over three weeks I was trying all possible combinations of letters and digits that could make the program run. I got it, but it was really like monkeys typing Shakespeare.

Strangely enough, that did not stop me and I then took Data Structures in Pascal. My professor, Dr. Adair Dingle, was probably the reason I stuck with programming. For the first time I was fascinated with computer science, got 100 on a test and coded something that actually ran. She was great and made me believe that I could do it. So I declared a minor in Computer Science.

In my senior year I took a Systems Programming class from a guy named Stephen Corbissero. He was not a professor, but he was the best teacher in the CS department because he actually knew how things worked. He could code in C and he knew Unix inside out. I was scared of him and of all electrical engineers that loved him. But I really wanted to learn C, so I took the class. As a final project we had to write a Unix Shell. It was hard for me, really really hard. I spent weeks in the lab working on this class. In the end I got an A- and learned that I can plunge through hard problems if I keep at them. And thanks to this class, I also got the skills needed to get my first job.

Goldman Sachs: Motif, C and Passion

In April of 1994 I had 3 offers. The first one was from Goldman Sachs in New York to work on Wall Street. My second offer was from IBM in Virginia to work on an aeronautics project and the last offer was to join the programming staff of the pharmaceutical giant Merck. I did not want to get clearance, nor was I excited enough about Merck, so I packed my bag and went to where the action was: New York City.

Goldman has always been an amazing company and back in 1994 it was still privately held. It was famous for hiring smart, capable college kids and then making them work really hard, while paying good salaries and fantastic end of year bonuses. During one of the interviews, I was asked to explain how Hashtables worked. To this day this is my favorite introductory technical question.

But Goldman had no illusions about our skills. College graduates were expected to have only theoretical knowledge, and so for 2 months during the summer we were put through a training program called NAPA (New Associate Programmer Analyst). The main objective was to make sure that we came out knowing how to program in C.

Since most of us had no idea what the difference was between char* and char** (the latter one was just scary), there was a lot of work to be done. Not only did we have to learn C well, we also needed to learn the X Window environment - a de facto standard on Wall Street in the early nineties. X Window came out of MIT and was a set of amazing client-server libraries for building graphical applications. We learned the raw X Window library, the layer above it, Xt, and the widget layer called Motif. I really did not understand how everything worked, but I got a sense of how powerful abstractions, libraries and layers can be.

My first project was to work on an account reconciliation system. Lots of systems in any financial institution are focused on reconciliation. Since any discrepancy can cost the company millions of dollars, the correctness of all books is of paramount importance. Back then the system ran a nightly batch that transferred data into a relational database. The interface was written in Motif and allowed managers to flip through thousands of bits of information. It had a striking resemblance to Excel - a table with columns that could be sorted and searched. But it needed to be custom, because Wall Street was all about custom IT.

I spent 2 years working on financial systems at Goldman, mastering C and X libraries, picking up Tcl/Tk, learning SQL and Sybase and a development environment called TeleUSE with its C-based scripting language D. In the process, I learned regular expressions, Awk and a bit of Perl, although I never developed a taste for any of them. But I got infected and I became very curious about programming. I wanted to do it well, very well. And so once again, like back in college, I spent my time plunging through problems. I would work 16+ hour days, going home only to shower and get some quick sleep. I would swallow programming books one after another and spend endless hours talking to people about code.

Back in Goldman I made a few good friends who stayed with me in my journey through the world of programming. One of them in particular made a big impact on me. No matter what, he would always figure stuff out. He was sharp, but more than that, he applied common sense. This was the tool that I lacked and he completely mastered. Looking back now, I realize that he was the first master of patterns that I ever met. And even though consciously I did not express it, the bug of patterns was planted inside of me. From then on I would be on an intense search for patterns in programming, science and life.

D.E.Shaw & Co: C++ and sharks

After spending 2 years at Goldman I was feeling bored. Not that I had mastered programming - far from it. I just felt that there was something better out there. Another friend of mine left the Fixed Income group to join a company called Juno, a spin-off from a high-tech investment fund called D.E.Shaw & Co. David E. Shaw is a famous computer scientist from Columbia University. In 1988 he started a high-tech hedge fund, focused on quantitative trading. Known for his exceptional intellect, he was able to attract PhD graduates from the top schools in the country, top-notch Unix hackers and incredibly bright humanities majors. Mr. Shaw created a culture of secrecy and eliteness - the firm was purposefully mysterious about its processes, strategies and plans.

To even land an interview in that place was hard, but to get a job was nearly impossible, because the interviewers asked very difficult programming problems and math puzzles. In all honesty, I could not have passed the interview had certain questions been asked. But serendipity and luck were on my side. They asked me questions that I could answer and they nodded when I passionately told them about my work at Goldman. To my big surprise I got an offer and became employee #223. I had no idea what I had gotten myself into.

My new boss was one of the most incredible people I've ever met. He rarely slept and emitted ideas with the speed of light. On my first day on the job he pulled me into the office and said that tomorrow morning was a deadline for an important demo - sending orders to the exchange using the Java client. Java? This was a hot new language that had just come out of Sun. I did not really know C++, let alone Java. In any case, my first day on the job was the first of many "all-nighters". The demo worked, but only thanks to another engineer who actually got it to work. After the first day (and night) I had a hunch that this place was going to be fun.

D.E.Shaw was one of the most competitive environments ever created. Assembling an impressive number of very smart people in one place has its pluses and minuses. Everyone competed really hard. Fortunately for me, I was at the bottom of the food chain and one of the least knowledgeable employees. It was a perfect learning environment - I was absorbing information like a sponge.

At D.E.Shaw I learned the intricacies of C++. It was not easy, but I had good teachers. A senior engineer, who later became my boss and mentor, knew C++ really well because he'd previously built massive power grid simulations for Con Edison. He was also the first person who explained the power of object-oriented programming to me. I remember reading Effective C++ books and holding in my hands one of the first copies of the famous Design Patterns book. I became a decent C++ programmer and I started to really understand some engineering principles, but I was still confused. C++ is a complicated language, where you really need to know how to program. It was hard to separate the what from the how and hard to see the bigger picture.

We were building an automated trading system that maintained a list of outstanding positions and made decisions based on history and market conditions. The program would process quotes from the exchange and write them to an object database called Object Store at the rate of 2,000 per second. It would apply sophisticated rule-based decision making to decide whether to keep a position or trade something automatically. The entire system was quite complex and took years to develop.

I felt like I lacked the Computer Science fundamentals to understand all the details and so I enrolled in a Master's program at the Courant Institute at NYU. Unfortunately I was disappointed. Most of the classes were rather basic and outdated. Most professors were interested in research and found the basics trivial. To compensate for the lack of excitement at school, I started learning more and more on my own.

At the same time D.E.Shaw was not doing as well. The company quickly grew to over 1,000 people and lost some of its talent to a rising internet startup - Amazon.com. As it turns out, the Third Market group where I worked was previously headed by none other than Jeff Bezos. He left a few months before I joined and was actively recruiting top talent to work for him in Seattle. Many D.E.Shaw alumni became instrumental to Amazon's success. A lot of people that I liked to work with left and I felt that it was time for me to move on as well.

Thinkmap: Java, Networks and Simplicity

I joined an information visualization startup called Plumb Design that was developing an innovative technology called Thinkmap. The idea behind Thinkmap was to create an abstraction for visualizing and navigating any data set as a network. The system was architected around several basic layers - data adapters took care of taking information and mapping it into the nodes and edges of the network. The next layer was responsible for arranging the network in space. The layer above, the most fascinating one, created motion by utilizing formulas from physics. The final layer was the visual one: a projection from the animation space onto the flat screen.
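To make the layering concrete, here is a minimal Python sketch of the four layers described above. It is not Thinkmap's actual code or API: the function names (adapt, arrange, animate, project), the circular layout and the spring-like pull are all assumptions chosen only to show how each layer hands its result to the next.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Network:
    """Nodes and edges produced by a data adapter (hypothetical structure)."""
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (source, target) pairs


def adapt(records):
    """Data adapter layer: map (item, related_item) records onto a network."""
    network = Network()
    for source, target in records:
        for name in (source, target):
            if name not in network.nodes:
                network.nodes.append(name)
        network.edges.append((source, target))
    return network


def arrange(network, radius=1.0):
    """Layout layer: place nodes evenly on a circle in space (z = 0)."""
    count = len(network.nodes)
    return {
        name: (radius * math.cos(2 * math.pi * i / count),
               radius * math.sin(2 * math.pi * i / count),
               0.0)
        for i, name in enumerate(network.nodes)
    }


def animate(network, positions, pull=0.1):
    """Motion layer: one spring-like step that pulls connected nodes together."""
    moved = dict(positions)
    for source, target in network.edges:
        sx, sy, sz = moved[source]
        tx, ty, tz = moved[target]
        dx, dy, dz = (tx - sx) * pull, (ty - sy) * pull, (tz - sz) * pull
        moved[source] = (sx + dx, sy + dy, sz + dz)
        moved[target] = (tx - dx, ty - dy, tz - dz)
    return moved


def project(positions, scale=100):
    """Visual layer: drop the z coordinate and scale to flat screen units."""
    return {name: (round(x * scale), round(y * scale))
            for name, (x, y, z) in positions.items()}


# A tiny thesaurus-like data set, made up for illustration.
records = [("happy", "glad"), ("happy", "joyful")]
network = adapt(records)
screen = project(animate(network, arrange(network)))
```

Because each layer only consumes the output of the one below it, a different data adapter or a different layout could be swapped in without touching the rest.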

Thinkmap was capable of visualizing a wide array of information. From a Thesaurus to Internet Movie Database, from Musical Albums to Books and even Java code. It was during my time at Plumb that I realized that everything in the world was about networks. I became fascinated with this field and soon discovered a branch of science called Complexity.

The study of complex systems is the study of unifying themes that exist between different scientific disciplines. Scientists discovered that things as diverse as grains of sand, economies, ecologies, physical particles and galaxies obey common laws. All of these systems can be represented as a network of information exchange, where the next level of complexity arises naturally from the interplay of the nodes on the lower level. What fascinated me was that by representing a complex system as a network, you can create a model that helps you understand how the system behaves. I sensed that Complexity science was the most profound thing I had ever encountered and that the universal patterns that I was seeking were explained by it.

While I spent all my free time reading about complexity, at work I was mastering Java. My new boss was tough and demanding. He was the biggest perfectionist I ever met. Very creative and very smart, he could write any piece of code faster, better and most importantly, simpler. Throughout my time at Plumb, at each encounter he would remind me that I needed to make things simpler. It was frustrating, because I was never good enough, but it was also very educational. Without a doubt, that experience made me a stronger person, preparing me for the future.

It was at Plumb Design that I really started to master programming. Some of the code that I'd written there had the elegance and beauty that is so intrinsic to all good code. To model a system correctly, you needed to think of it in terms of interfaces. Each building block itself would be simple, but when you arranged them together so that they fit, a new set of behaviors would arise. One day Complexity Science and Java programming converged. I realized that code is just like complex systems: a bigger whole arises through the interplay of its parts.

NYU: 5 years of Software Engineering for Undergrads

At the same time I was finishing up my masters at NYU. Once I was hanging around the department and jokingly said that back in Ukraine I wanted to be a teacher. The department chair jumped on that and asked if I wanted to teach an undergraduate programming class. I figured that it could not be all that bad and signed up to teach Introduction to Programming. To be honest, it was bad and it was hard. The kids had no idea what programming was and did not really want to learn it either. Despite the fact that the class was not a big success, the department asked me to do another one. I said that instead of Pascal I wanted to teach an advanced class in Java.

And so was born the Software Engineering in Java class, one of the biggest adventures of my career. I taught this class 5 times and each time it was so much fun. It was really intense - all of the best, most cutting edge stuff I knew, I shared with NYU CS seniors. We covered Java topics like Exceptions, Reflection, Threads, Sockets and RMI. We learned how to persist JavaBeans in XML and how to do relational databases in Java. We covered basic design patterns, unit testing, refactoring and other principles of agile software engineering. But the best part of the class was that we had a semester long project that modeled a classic complex system - a pond environment where digital creatures fought for their survival.

The class won the award for outstanding teaching, but the biggest reward was the comments that I got from students. They felt that unlike any other class that they took, this one was really preparing them for their career. Many years after my graduation I returned the favor. Like Stephen Corbissero at Lehigh University, at NYU I created a course that was based on pragmatic things that engineers do in the field, not some theoretical ideas that never see the light outside of academia.

To this day, I get emails from my students thanking me for the class. It makes me both proud and happy. But as much as they are grateful to me, I am thankful to them much more. Because as you know, the best way to learn is to teach. It is teaching this class that really made me into the software engineer that I am today.

Information Laboratory: Small Worlds and Large-Scale Software

In the summer of 2000 I became convinced that Complexity Science had many business applications. On a whim I decided to start a company called Information Laboratory that would turn the insights of complexity science into a piece of software. I envisioned a powerful library, a modeling toolkit, that would help people understand the behavior of diverse complex systems - from software and business organizations to power grids and traffic flows. At the heart of this library would be networks or mathematical graphs. For each situation there would be components to adapt the information into the data layer. Once the system was represented as a network, it would be analyzed using a set of graph algorithms.

Inspired by the insights in the recent paper by Cornell's PhD student, Duncan Watts, we realized that a lot can be said about the behavior of a system just by looking at its structure. As it turns out, there are not that many ways for nodes to be wired together. Some nodes in a network look perfectly balanced - inputs are equal to outputs. But some are not and those are very interesting. For example, there are nodes that have a lot of inputs and just a few outputs or the other way around. The question that we wanted to answer was: What do these nodes mean in different systems? For example, in a power grid, a node with a lot of incoming connections and just a few outputs implies a potential outage point. Looking at communication pathways in a company, a hub - the person who receives and disseminates a lot of information - is a valuable employee. And in software, a component that does not depend on any other but has a lot of dependents needs to be handled with care.
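The idea of comparing inputs and outputs per node can be sketched in a few lines of Python. This is only an illustration: the function names, the ratio of 3 and the toy power grid are assumptions, not anything taken from our library.

```python
from collections import Counter


def degree_profile(edges):
    """Count (inputs, outputs) for every node in a list of (source, target) edges."""
    inputs, outputs = Counter(), Counter()
    for source, target in edges:
        outputs[source] += 1
        inputs[target] += 1
    nodes = set(inputs) | set(outputs)
    return {node: (inputs[node], outputs[node]) for node in nodes}


def unbalanced(profile, ratio=3):
    """Flag nodes whose inputs outnumber outputs by `ratio`, or the reverse.

    The ratio is an arbitrary, hypothetical cut-off.
    """
    flagged = {}
    for node, (n_in, n_out) in profile.items():
        if n_in >= ratio * max(n_out, 1):
            flagged[node] = "mostly inputs"
        elif n_out >= ratio * max(n_in, 1):
            flagged[node] = "mostly outputs"
    return flagged


# A made-up power grid: three plants feed one substation that feeds a city.
grid = [
    ("plant_a", "substation"),
    ("plant_b", "substation"),
    ("plant_c", "substation"),
    ("substation", "city"),
]
profile = degree_profile(grid)
suspects = unbalanced(profile)
```

Here only the substation is flagged, since it has three inputs and a single output.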

It is the software analysis that soon became our primary focus. We realized that analyzing software structure is a powerful way of identifying, preventing and solving architectural problems. For example, in software the component that would have a lot of dependencies would be vulnerable to changes. We called such components 'breakable' and considered it bad. Another bad structure would be a hub, since it would have a lot of dependencies and dependents. But worst of all would be something that we dubbed a 'tangle' - a set of components interdependent via multiple loops. The result of our insights was a software architecture tool called Small Worlds.
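As a rough illustration of these three structures, the Python sketch below classifies components in a small dependency graph. It is not the Small Worlds algorithm: the threshold is arbitrary, the component names are made up, and a tangle is approximated here as any group of two or more components that can all reach each other through dependencies.

```python
from collections import defaultdict


def analyze(dependencies, threshold=2):
    """Classify components from (component, dependency) pairs.

    `threshold` is a hypothetical cut-off for what counts as "a lot".
    """
    uses, used_by = defaultdict(set), defaultdict(set)
    for component, dependency in dependencies:
        uses[component].add(dependency)
        used_by[dependency].add(component)
    components = set(uses) | set(used_by)

    def reachable(start):
        """Everything `start` depends on, directly or indirectly."""
        seen, stack = set(), [start]
        while stack:
            for nxt in uses.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    reach = {c: reachable(c) for c in components}

    # A tangle: components that mutually reach each other (a dependency loop).
    tangles = set()
    for c in components:
        group = frozenset(o for o in reach[c] if c in reach[o])
        if len(group) > 1:
            tangles.add(group)

    hubs = {c for c in components
            if len(uses.get(c, ())) >= threshold
            and len(used_by.get(c, ())) >= threshold}
    breakables = {c for c in components
                  if len(uses.get(c, ())) >= threshold} - hubs

    return {
        "breakables": sorted(breakables),
        "hubs": sorted(hubs),
        "tangles": sorted(sorted(group) for group in tangles),
    }


dependencies = [
    ("orders", "db"), ("orders", "mail"), ("orders", "billing"),
    ("core", "db"), ("core", "mail"), ("ui", "core"), ("api", "core"),
    ("parser", "lexer"), ("lexer", "tokens"), ("tokens", "parser"),
]
report = analyze(dependencies)
```

With this input, orders comes out as breakable, core as a hub, and parser, lexer and tokens as a tangle.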

The tool was written entirely in Java and featured sophisticated graph visualizations and algorithms. It worked by reading Java class files of other software and constructed a gigantic network where all components and dependencies were captured. The tool performed automatic structural analysis and identified problematic components - breakables, hubs and tangles. The result of the analysis was a report and the architectural score of the entire system. In addition the tool offered insights into the causes of the issues that it identified and aimed to help architects keep their large-scale systems clean.

IBM: Eclipse, Code Review and Rational Software Architect

In July 2003 IBM acquired Information Laboratory, aiming to roll Small Worlds into their product line. Just a few months before that IBM had acquired Rational Software - the maker of popular software development and modeling tools. Post acquisition I joined the software quality group as the Architect of Code Analysis tools. Needless to say a switch from a tiny startup to the biggest software maker in the world was not easy. At first, most of my time was consumed figuring out how things worked and how to make anything happen. The original plan was to keep Small Worlds as a standalone product, but soon it was clear that it wasn't to be. IBM was planning the roll out of the next generation of its programming tools: Rational Developer and Rational Architect, both based on their open source offering called Eclipse. So IBM renamed Small Worlds to Structural Analysis for Java (SA4J) and made it freely available via its alphaWorks program.

The next challenge was to rebuild the tool so that it fit into IBM's product line and marketing plans. As a result, it was split into two pieces - one ended up being part of the Rational Architect offering as Structural Patterns. The second piece, called Code Review, is something that we built entirely from scratch. While Small Worlds was focused on architectural problems, Code Review found a range of issues from security violations to redundant code to logical flaws. It also offered automatic refactoring that with a touch of a button helped developers fix their code.

Learning the Eclipse API was not a lot of fun, and getting the product done with a small team in a matter of 9 months was really a 'mission impossible'. We had to balance internal politics with the pressure of the schedule and the inability of management to make up their mind. Remarkably, our team succeeded, largely because we focused on code more than politics. Code Review was ready on time and was shipped in the first version of Rational Developer.

But the entire experience was disappointing. I realized that at the end of the day it was not about building quality tools or doing the right thing. Political and slow, the software quality group also was known for its inability to build quality software on time. I felt that this was too hypocritical for me to stick around.

Data Synapse: Virtualization

I left IBM to become chief architect of Data Synapse, the grid computing company based in New York. Briefly in 2000, during my first month as the founder of Information Laboratory, I helped Data Synapse with their original grid server infrastructure. Five years later when I got a call from a founder to join full-time, I was intrigued. Data Synapse aimed to build its second product, an on demand virtualization infrastructure for J2EE. The idea was to enable dynamic provisioning of application servers to meet the changing demands of an enterprise throughout a day. In a way, this was a more sophisticated precursor of what EC2 is today. And I just could not resist this project.

Without a doubt, this was the most challenging piece of software I ever dealt with. Its core was a sophisticated scheduling algorithm that orchestrated a grid with thousands of servers. Each server was provisioned with bundles containing a stripped down version of Apache, Tomcat, WebLogic, JBoss, WebSphere and many other grid containers. Each application would be deployed to the central broker and then distributed to each node on the grid. As an input, the broker would get a schedule indicating when each application needed to run. Each grid application included a set of agents that monitored its characteristics - such as throughput, memory load, disk usage, etc. Based on the current state of the grid and target performance rules, the broker would decide how to allocate the limited resources.
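The broker's job can be pictured with a deliberately simplified Python sketch. This is not the FabricServer scheduling algorithm, which was far more sophisticated. The App fields, the greedy rule and all the numbers are assumptions made up to illustrate the inputs described above: a schedule, metrics reported by agents, a target performance rule and a limited pool of servers.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    start_hour: int           # scheduled to run from this hour (inclusive)
    end_hour: int             # ... until this hour (exclusive)
    throughput: float         # requests/second reported by monitoring agents
    target_per_server: float  # rule: requests/second one server should handle


def allocate(apps, total_servers, hour):
    """Hand out servers one at a time to the app with the largest unmet demand."""
    active = [a for a in apps if a.start_hour <= hour < a.end_hour]
    allocation = {a.name: 0 for a in active}

    def unmet(app):
        return app.throughput - allocation[app.name] * app.target_per_server

    for _ in range(total_servers):
        if not active:
            break
        neediest = max(active, key=unmet)
        if unmet(neediest) <= 0:
            break  # every active application already meets its target
        allocation[neediest.name] += 1
    return allocation


apps = [
    App("pricing", 9, 17, throughput=900, target_per_server=300),
    App("reports", 0, 6, throughput=200, target_per_server=100),
    App("portal", 8, 20, throughput=250, target_per_server=100),
]
daytime = allocate(apps, total_servers=5, hour=10)
overnight = allocate(apps, total_servers=5, hour=3)
```

During the day the five servers are split between pricing and portal, while overnight only reports is active and the spare servers stay idle.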

The result of many months of work was the first version of DataSynapse's FabricServer. As soon as the product was released it was piloted at major banks - Wachovia and Bank of America. Financial institutions were always on the cutting edge of grid computing, because of their need for massively parallel risk computations. And when the J2EE virtualization became available, the banks were first in line to give it a try. Running Fabric Server in a real environment proved to be yet another challenge. In the early days we would constantly uncover stuff in the field that we would not have thought of back in the office. But as time went by, the product worked as expected in more and more situations. This system of enormous complexity really did work.

AdaptiveBlue: JavaScript, Mozilla and Amazon Web Services

In February 2006 I founded my second company - AdaptiveBlue. While Information Laboratory was all about structure, AdaptiveBlue is focused on what can be done with semantics. Fascinated with the ideas of the Semantic Web and smart browsing, I dived into the world of new web technologies. Switching to JavaScript and the Mozilla platform was not easy, but through the years I have learned to adapt and embrace new technologies.

Today our software is a mix of JavaScript, Mozilla XPCOM and XUL on the front end. The back end has some PHP scripting, but mostly it is written in Java. Both back end and front end share the same XML infrastructure allowing us to make easy changes and extensions to the system. To scale to hundreds of thousands of users, we chose to architect our software around Amazon Web Services - the most reliable web-scale infrastructure available today. We also heavily use available libraries and try to not re-invent the wheel. In short, we focus on the application itself and on the user experience. The technology is just the means to enable our business.

ReadWriteWeb: The Reflections

If you've reached this sentence, you must have realized that I consider myself exceptionally fortunate. I've had so many different experiences, learned from so many bright people, built amazing software, discovered the power of complex systems and had a lot of great students. In the last 17 years I've truly lived my American Dream. Of course my character, determination and passion are also responsible for my life. Yet, without the opportunities that I've had, none of what I've done would've been possible. America, in my mind, is all about the opportunities.

The latest opportunity that I was given was to be a contributor to this wonderful blog, ReadWriteWeb. Being able to cover technical trends, to share my views and most importantly to learn from all of our readers is a true privilege. I am grateful to Richard, the writers and to all of you for this unique experience. I hope that my journey so far has been both interesting and inspirational for you. Here is to the American Dream and endless possibilities.

http://www.readwriteweb.com/archives/the_american_dream_17_years_of_software_engineering.php http://www.readwriteweb.com/archives/the_american_dream_17_years_of_software_engineering.php Personal Sun, 13 Apr 2008 16:23:11 -0800 Alex Iskold
Top 10 Traits of a Rockstar Software Engineer

Every company is a tech company these days. From software startups to hedge funds to pharmaceutical giants to big media, they're all increasingly in the business of software. Quality code has become not only a necessity, but a competitive differentiator. And as companies compete around software, the people who can make it happen - software engineers - are becoming increasingly important. But how do you spot the 'cream of the crop' programmers? In this post we outline the top ten traits of a rockstar developer.

We've written here before about the future of software development, in which a few smart developers can leverage libraries and web services to build large-scale systems of unprecedented complexity. It only takes a couple of smart engineers to create quality software of immense value, and below is a list of the top ten qualities you should look for when hiring a developer:

  1. Loves To Code
  2. Gets Things Done
  3. Continuously Refactors Code
  4. Uses Design Patterns
  5. Writes Tests
  6. Leverages Existing Code
  7. Focuses on Usability
  8. Writes Maintainable Code
  9. Can Code in Any Language
  10. Knows Basic Computer Science

1. Loves To Code

Programming is a labor of love. Like any occupation, truly great things are achieved only with passion. It is a common misconception that writing code is mechanical and purely scientific. In truth, the best software engineers are craftsmen, bringing energy, ingenuity, and creativity to every line of code. Great engineers know when a small piece of code is shaping up perfectly and when the pieces of a large system start to fit together like a puzzle. Engineers who love to code derive pleasure from building software in much the same way a composer might feel ecstatic about finishing a symphony. It is that feeling of excitement and accomplishment that makes rockstar engineers love to code.

2. Gets Things Done

There are plenty of technical people out there who talk about software instead of writing it. One of the most important traits of a great software engineer is that they actually code. They actually get things done. Smart people know that the best way to solve problems is to go straight at them. Instead of spending weeks designing complex, unnecessary infrastructure and libraries, a good engineer should ask: What is the simplest path to solving the problem at hand? The recent methodologies for building software, called Agile practices, focus on just that. The idea is to break complex projects into short iterations, each of which focuses on a small set of incremental features. Because each iteration takes just a few weeks to code, the features are manageable and simple. Teams that follow agile practices never create infrastructure for its own sake; instead, they are focused on addressing a simple set of requirements. The secret is that when this approach is applied iteratively, a rich, complex piece of software arises naturally.

3. Continuously Refactors Code

Coding is very much like sculpting. Just like an artist is constantly perfecting his masterpiece, an engineer continuously reshapes his code to meet requirements in the best possible way. The discipline of reshaping code is known as refactoring and was formally described by Martin Fowler in his seminal book. The original idea behind refactoring was to improve code without changing what it does, moving pieces of the software around to ensure that the system is free of rot and also does what it is supposed to do based on current requirements. Continuous refactoring allows developers to solve another well-known problem - black box legacy code that no one wants to touch. For decades engineering culture dictated that you should not change the things that work. The issue, though, is that over time you become a slave to the old code, which grows unstable and incompatible. Refactoring changes that, because instead of the code owning you, you own the code. Refactoring establishes ongoing dialogue between the engineer and the code and leads to ownership, certainty, confidence, and stability in the system.
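A small before-and-after example may help show what "improving code without changing what it does" looks like in practice. The invoice functions below are hypothetical and written in Python purely for illustration. The refactored version extracts a helper and names the magic number, while returning exactly the same result.

```python
# Before: works, but the tax rule is duplicated and the names say nothing.
def invoice_total_before(items):
    t = 0
    for i in items:
        if i[2]:
            t = t + i[0] * i[1] + i[0] * i[1] * 0.25
        else:
            t = t + i[0] * i[1]
    return t


# After: same behavior, with the rule extracted and named.
TAX_RATE = 0.25  # hypothetical rate, the same value buried in the code above


def line_total(price, quantity, taxable):
    """Total for one invoice line, including tax when the item is taxable."""
    subtotal = price * quantity
    return subtotal * (1 + TAX_RATE) if taxable else subtotal


def invoice_total_after(items):
    """Sum of all line totals; each item is (price, quantity, taxable)."""
    return sum(line_total(price, quantity, taxable)
               for price, quantity, taxable in items)


items = [(10, 2, True), (4, 5, False)]
same_result = invoice_total_before(items) == invoice_total_after(items)
```

Checking that both versions agree on the same input is what makes it safe to delete the old one.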

4. Uses Design Patterns

Ever since the so called Gang of Four published their famous Design Patterns book, world-class engineers have been talking about patterns. Patterns are ubiquitous in our world - both in nature and all human endeavors; software engineering is no exception. Patterns are recurrent scenarios and mechanisms that live across languages and systems. A good engineer always recognizes and leverages patterns, but is not driven by them. Instead of trying to fit the system into a set of patterns, the engineer recognizes opportunities in which to apply patterns. Applying a pattern ensures correctness since it leverages existing know-how: a method for solving a particular engineering problem that has worked before.
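As one example, here is a minimal Python sketch of the Observer pattern, one of the patterns catalogued in the Design Patterns book. The QuoteFeed class and the sample quotes are made up for illustration. The point is that the publisher knows nothing about who is listening.

```python
class QuoteFeed:
    """Subject: publishes events to whoever has subscribed (hypothetical class)."""

    def __init__(self):
        self._observers = []

    def subscribe(self, callback):
        """Register any callable that accepts a single event argument."""
        self._observers.append(callback)

    def publish(self, event):
        """Notify every observer, in subscription order."""
        for callback in self._observers:
            callback(event)


feed = QuoteFeed()
log = []      # first observer: records every quote
alerts = []   # second observer: records only large price moves

feed.subscribe(log.append)
feed.subscribe(lambda quote: alerts.append(quote) if abs(quote[1]) > 5 else None)

feed.publish(("ACME", 1.5))   # (symbol, price change), made-up data
feed.publish(("ACME", -7.0))
```

New observers can be added without changing QuoteFeed, which is the problem this pattern has solved many times before.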

5. Writes Tests

Long gone are the days when engineers thought of testing as beneath them. After all, how can you be certain that your code is actually working if you never test it? An agile practice called Unit Testing has recently gained popularity because it focuses on writing tests to mirror the code. As the system grows, the body of tests grows with it, providing proof that the code actually works. Experienced engineers know and understand the value of tests, because their goal is to create a working system. Good engineers will always write a test once a bug has been exposed to make sure it does not come back again. But a good engineer also knows not to waste time writing trivial or redundant tests, instead focusing on testing the essential parts of each component.
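The habit of writing a test once a bug has been exposed can be sketched with Python's standard unittest module. The average function and the bug it once had are hypothetical. The second test exists only to make sure that bug stays fixed.

```python
import unittest


def average(values):
    """Return the mean of values; an empty list averages to 0.0."""
    if not values:
        # Hypothetical bug fix: this case used to raise ZeroDivisionError.
        return 0.0
    return sum(values) / len(values)


class AverageTest(unittest.TestCase):
    def test_typical_values(self):
        self.assertEqual(average([2, 4, 6]), 4)

    def test_empty_list_regression(self):
        # Written after the bug was found, so it cannot quietly come back.
        self.assertEqual(average([]), 0.0)


# Run the test case and keep the result object for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageTest)
result = unittest.TextTestRunner().run(suite)
```

Two focused tests cover the essential behavior here. A test for every possible list would add time without adding confidence.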

6. Leverages Existing Code

Reinventing the wheel has always been one of the biggest problems in the software industry. From inventing new languages to rewriting libraries, the strange drive to ignore and redo what is already there and already works has been the cause of a lot of software failures. A rockstar engineer will focus on three essential kinds of reuse. First of all, the reuse of internal infrastructure, the code that he and his peers have written. Secondly, the use of third party libraries, for example, in Java, the libraries that are part of the JDK or popular libraries provided by the Apache Foundation. And finally, a good engineer would look to leverage web-scale web services, like the ones offered by Amazon. Correct leveraging of existing infrastructure allows rockstar engineers to focus on what is most essential - the application itself.
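The same principle applies in any language. As a small illustration in Python, counting words is a task that is tempting to hand-code, but the standard library's collections.Counter already does it. The sample sentence is made up.

```python
from collections import Counter

text = "the quick fox jumps over the lazy dog the end"

# Counter does the tallying and the ranking, so there is no custom
# dictionary bookkeeping or sorting code to write, test and maintain.
word_counts = Counter(text.split())
most_common_word = word_counts.most_common(1)
```

One line of library code replaces a loop, a dictionary and a sort, and it has already been tested by everyone else who uses it.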

7. Focuses on Usability

Good engineers always focus on the users. Whether the user is a business or an individual, whether the engineer works for a consumer software company or an investment bank, the focus is on working, usable software. How will users interact with the system? Does it provide a simple, intuitive, and smooth experience? The notion that because a software engineer is a techie, he or she thus can not relate to how other people interact with the system is deeply flawed. Good engineers work hard to make the system simple and usable. They think about customers all the time and do not try to invent convoluted stuff that can only be understood and appreciated by geeks.

8. Writes Maintainable Code

The other secret of good engineers is that it takes the same amount of time to write good code as it does to write bad code. A disciplined engineer thinks about the maintainability and evolution of the code from its first line. There is never any reason to write ugly code, a method that spans multiple pages, or code with cryptic variable names. Rockstars write code which follows naming conventions, code which is compact, simple and not overly clever. Each line of code serves its purpose and resides in the right place. The bits that are difficult to understand are commented, but otherwise naming conventions are clear. Expressive names for methods and variables can make the code self-explanatory.
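To see how much names matter, compare two versions of the same hypothetical Python function. Both do exactly the same thing, but only one can be read without tracing through it.

```python
# Cryptic: the reader has to reverse-engineer what l, n, x and r mean.
def f(l, n):
    r = []
    for x in l:
        if x[1] >= n:
            r.append(x[0])
    return r


# Expressive: the names carry the meaning, so no comment is needed.
def names_of_adults(people, minimum_age):
    return [name for name, age in people if age >= minimum_age]


people = [("Ann", 34), ("Bob", 12), ("Cy", 18)]  # made-up (name, age) pairs
adults = names_of_adults(people, minimum_age=18)
```

The second version is also shorter, which fits the point that good code takes no longer to write than bad code.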

9. Can Code in Any Language

A good engineer might have a favorite programming language but is never religious about it. There are many great programming languages these days and to say that you can only code in one of them is to demonstrate a lack of versatility. In Java, C#, or C++ you can write any modern software. You can code the back end of any web site in PHP, in Perl, or in Ruby. At the end of the day, the language does not matter as much as the libraries that come with it. A good engineer knows that and is willing and able to learn new languages, new libraries and new ways of building systems.

10. Knows Basic Computer Science

The last, but certainly not the least trait of a great engineer is a solid foundation. A good engineer might not have a degree in computer science but must know the basics - data structures and algorithms. How can you build large scale software without knowing what a hashtable is? Or the difference between a linked list and an array? These are the basics that everyone should know. And the algorithms are just as important - from binary search to different sorts to graph traversals, a rockstar engineer must know and internalize the basics. These foundations are necessary to make the right design decisions when building any modern piece of software.
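Two of these basics fit in a few lines of Python. The sketch below shows a textbook binary search over a sorted list next to a lookup in a hashtable (a Python dict). The sample data is made up.

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent.

    Each step halves the remaining range, so the search takes O(log n) steps.
    """
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        middle = (low + high) // 2
        if sorted_items[middle] == target:
            return middle
        if sorted_items[middle] < target:
            low = middle + 1
        else:
            high = middle - 1
    return -1


primes = [2, 3, 5, 7, 11, 13]
found_at = binary_search(primes, 11)
not_found = binary_search(primes, 4)

# A hashtable finds a value by key directly, without scanning or sorting.
ages = {"ada": 36, "alan": 41}
ada_age = ages["ada"]
```

Knowing when a sorted array with binary search is enough, and when a hashtable is the better fit, is exactly the kind of design decision these foundations support.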

Conclusion

There are many traits that distinguish great software engineers. Among the ones we discussed, passion is certainly very important. Knowing the basics like code reuse, design patterns, fundamental data structures, and algorithms is necessary, while agile practices of refactoring and unit testing help engineers iteratively evolve complex software. Most importantly, rockstar engineers believe in simplicity and common sense. It is these beliefs that help them succeed in building the seemingly impossible, complex software systems that are necessary in today's world.

Let us know what other traits you think a rockstar software engineer should have, in the comments below.

http://www.readwriteweb.com/archives/top_10_software_engineer_traits.php http://www.readwriteweb.com/archives/top_10_software_engineer_traits.php Trends Tue, 08 Apr 2008 00:50:46 -0800 Alex Iskold