cs527: October 2009

Thursday, October 22, 2009

Armstrong thesis chapter 4

How did Erlang's view of the world help with implementing an HTTP server?

In Erlang everything is a process, and processes can only interact by exchanging messages. So to implement an HTTP server, the main loop of the server code creates a new process for each connection; that process then listens for the client’s requests and responds.

The author describes an interesting concept of handling errors, which is based on the hardware duplication approach to handling failures. In order to provide high reliability, many systems have redundant hardware so that if one unit fails the other can take over and keep the system running. This idea is implemented in Erlang. Processes are linked so that if one fails, the other is notified of the failure with an explanation and can take appropriate action. The failing process sends a message to the other process before it terminates. If there is a hardware failure on the machine on which one process runs, the hardware failure is converted to look like a software failure. This allows a uniform error recovery mechanism to be used for both hardware and software failures.
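The linking idea can be sketched outside Erlang. The following is a minimal Python approximation, not Erlang's actual mechanism: threads stand in for processes, a queue stands in for the linked process's mailbox, and the names spawn_link, healthy and faulty are made up for illustration.

```python
import queue
import threading

def spawn_link(name, func, linked_mailbox):
    """Run func in its own thread and report how it ended to the linked mailbox."""
    def body():
        try:
            func()
            linked_mailbox.put(("EXIT", name, "normal"))
        except Exception as exc:
            # The failure is turned into an ordinary message with an explanation.
            linked_mailbox.put(("EXIT", name, repr(exc)))
    thread = threading.Thread(target=body, name=name)
    thread.start()
    return thread

def healthy():
    pass  # finishes without error

def faulty():
    raise RuntimeError("disk unreachable")  # simulated failure

mailbox = queue.Queue()
threads = [spawn_link("healthy", healthy, mailbox),
           spawn_link("faulty", faulty, mailbox)]
for t in threads:
    t.join()

# The linked side reads the exit messages and decides what to do with them.
exits = {}
while not mailbox.empty():
    _tag, name, reason = mailbox.get()
    exits[name] = reason
```

The point of the sketch is only that the code doing the work and the code reacting to the failure live in different threads of execution and communicate by message.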

Another interesting concept is the notion of completely separating the error-handling code from the code solving the problem and having them run in different threads of execution. The main advantage of this approach is that the code can be developed on a single-node system and ported to a distributed environment with only minor code changes.

How important is it to use intentional programming when it comes to maintenance? (The author gave the example of the history of the Dictionary API)

I think this is the central idea behind writing code that is easy to maintain. Function and variable names should be meaningful and descriptive, and should clearly express the function’s purpose.

Dense Linear Algebra, Graph Algorithms, Monte Carlo

Do you know these patterns, or are they new to you? If they are new, could you understand them? What questions did you have? What could the authors have done (more pictures, more examples, more definitions) to help you get it?

I know and have used most of the ideas presented, but I did not know them as patterns. Considering linear algebra, graph algorithms and Monte Carlo methods as patterns seems strange to me. These are three huge areas of study, and there are generally complete courses (e.g. graph theory) devoted to these problems. There are many different graph algorithms, and the range of problems that can be modeled with them is huge. Maybe a pattern language for graph algorithms would be more appropriate.

I have taken classes in linear algebra and algorithms where we discussed graph algorithms and Monte Carlo methods, so it was a straightforward read for me. Overall I think they did a great job in describing the patterns. For the linear algebra pattern, the figures illustrating matrix multiplication and the memory hierarchy are helpful in understanding the various approaches when starting from scratch. The graph algorithm and Monte Carlo patterns include enough detail to be understood.

How are computational patterns different from structural patterns? Is there a clear distinction, or a fuzzy one?

The only difference that I can see is that structural patterns define the structure (the steps) used to solve a problem (e.g. a software design problem), while computational patterns actually solve the computational problems.

Wednesday, October 21, 2009

Armstrong thesis chapter 2

This chapter makes some simple and very intuitive points in defining architecture. An architecture is basically composed of 4 parts: a problem domain, which states the problem that the architecture is designed to solve; a philosophy, which is the set of ideas behind the architecture being developed; a set of construction guidelines; and predefined components that can be used to avoid designing from scratch.
The author uses these 4 parts in developing an architecture for fault-tolerant systems. Fault-tolerant systems, as defined in the thesis, are software systems that behave reasonably in case of errors. They have a hierarchical structure consisting of multiple levels. The complexity of the tasks performed increases with the level, so the top level performs the most complex task. To achieve fault tolerance the author emphasizes strong encapsulation to prevent errors in one part of the system from affecting other parts. Error isolation is the key here and is the main characteristic of COPL. The processes running on the machine must be isolated so that an error in one process does not affect another process.
Another point that I found interesting is the notion of “fail fast”, where a process stops in case of an error. The idea is that the process should either work properly or signal the failure and stop. I think this is a good idea and can be implemented for some systems. However, it seems counterintuitive given the notion of fault tolerance. It seems more logical that such systems should try to recover from failure, especially given that failures are part of most software systems.

Tuesday, October 20, 2009

Event based / Map reduce

Map reduce

Map reduce is a pattern based on the map and reduce functions usually found in functional languages. The idea is very simple. The map function is applied in parallel to each object in a set, then the results are collected and combined by the reduce function to get the output. This pattern works very well for problems that fit it, and another advantage comes from the fact that, when it is incorporated into a framework, the programmer only needs to provide the map and reduce functions. One aspect of this pattern that we are currently looking at in one of our cloud computing projects is how to incorporate a reschedule function. Basically, when the operations are being carried out independently on different servers and a map or a reduce fails, we want to reschedule the failed operation without having to reschedule all the other tasks.
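The rescheduling idea can be sketched as follows. This is a minimal single-machine illustration under several assumptions: worker threads stand in for servers, only failed map tasks are retried (the reduce step runs sequentially at the end), and the names map_reduce, flaky_square and max_attempts are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def map_reduce(map_fn, reduce_fn, items, initial, max_attempts=3):
    """Apply map_fn to every item in parallel, retrying only the tasks that failed."""
    results = {}
    pending = list(range(len(items)))
    with ThreadPoolExecutor(max_workers=4) as pool:
        for _ in range(max_attempts):
            if not pending:
                break
            futures = {i: pool.submit(map_fn, items[i]) for i in pending}
            pending = []
            for i, future in futures.items():
                try:
                    results[i] = future.result()
                except Exception:
                    pending.append(i)  # reschedule this task only
    if pending:
        raise RuntimeError(f"map tasks still failing: {pending}")
    # Combine the map results in input order.
    return reduce(reduce_fn, (results[i] for i in range(len(items))), initial)

attempts = {}
attempts_lock = threading.Lock()

def flaky_square(x):
    """Square x, but fail the first time x == 3 to simulate a server failure."""
    with attempts_lock:
        attempts[x] = attempts.get(x, 0) + 1
        first_try = attempts[x] == 1
    if x == 3 and first_try:
        raise IOError("simulated server failure")
    return x * x

total = map_reduce(flaky_square, lambda a, b: a + b, [1, 2, 3, 4], 0)
```

In this run only the task for input 3 is executed twice; the other three map tasks run once and their results are reused.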

Event based

This pattern is similar to the observer pattern. The system is composed of components that can post events (announcers) or listen to events (listeners), and a medium, which is the central piece of the solution and acts as the liaison between the listeners and announcers. The medium dispatches the received events to the affected listeners and allows listeners to register for the events of their choice. I think the main difference from the observer pattern is the presence of the medium. In the observer pattern the subject maintains a list of its observers and directly notifies them of state changes without passing through an intermediate medium.
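A minimal sketch of the medium may make the difference from the observer pattern concrete. The class name Medium, its methods and the event names below are all invented for illustration; the point is that announcers and listeners only know the medium, never each other.

```python
from collections import defaultdict

class Medium:
    """Central liaison: listeners register by event type, announcers post events."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def register(self, event_type, listener):
        self._listeners[event_type].append(listener)

    def announce(self, event_type, payload):
        # Dispatch only to the listeners registered for this event type.
        for listener in list(self._listeners.get(event_type, [])):
            listener(payload)

received = []
medium = Medium()
medium.register("order_placed", lambda p: received.append(("billing", p)))
medium.register("order_placed", lambda p: received.append(("shipping", p)))
medium.register("order_cancelled", lambda p: received.append(("refunds", p)))

# The announcer does not hold a list of observers; it just posts to the medium.
medium.announce("order_placed", 42)
```

Here the two "order_placed" listeners are called in registration order and the "order_cancelled" listener is not called at all.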

Thursday, October 15, 2009

Pipes & Filters, Layered systems, Iterative refinement

How do the two repeats differ from the first versions that you read?
I think the difference is that they are presented here to show how to exploit task parallelism.

Did they miss anything?
I am not sure if there is a reason why they did not include any pictures to illustrate the concepts. Pictures would make things simpler to follow.

Did they include something that the first versions didn't?
Pictures would have made their points easier to follow, but they did do a good job of describing the patterns in steps. The analogy with the task graph for pipes and filters was very good.

Did you learn anything from them?
I did not learn anything new from the first 2 patterns as I already knew them. The third pattern was confusing to me.

Do you have any advice to the authors of these patterns?
They should include some pictures and/or graphs to facilitate learning.

For the new pattern, have you seen programs that used it? If so, do you have a good story to tell about it? What was hardest to understand about this pattern?
I have probably used the idea behind this pattern, but the description is not very clear to me. It is like some kind of loop unrolling where each iteration is considered independently.

Tuesday, October 13, 2009

Rereading the classics: BA ch 14

This chapter is more like a summary of the Beautiful Architecture book. It has been an enjoyable and rewarding experience reading this book. I have learned so much from some of the greatest architectures and thought a great deal about the architectures that were not so great. One lesson that I learned is that there is no such thing as the perfect architecture for a system. It is important to look at what exists and how people have solved similar problems before embarking on a design. Patterns are great and should be applied where applicable. This book emphasizes the importance of good architecture, and as software builders we need to know that a good design is vital to the success of the product at all stages (development and support).

This chapter talks about Smalltalk, which is an interesting programming language. I did not learn much that was new in this chapter because I used Smalltalk in CS598. An interesting aspect of Smalltalk is that it is a pure object-oriented language, where everything is an object and objects interact with each other only through messages. Another aspect is that the programmer must work in the Smalltalk environment with the set of class libraries provided. This makes Smalltalk less flexible, and it is one of the reasons why it did not become very popular even though many of its concepts were adopted by other languages.
The conclusion of this chapter, which is also like a conclusion to the book, reminds us again of our task as programmers. Just like programming, architecture is a matter of practice. We need to build systems that are not only beautiful, but that work. The very last sentence of the book states that very clearly: “Architecture is a chaotic adventure because beautiful architecture alone is not enough; not only beauty, but also usefulness, is the law for architecture and programming alike.”

Refactoring for Reentrancy

This paper describes a concept that I have been hearing a lot about lately at work, as we are thinking about porting our sequential firmware to a new multicore platform that is being developed by our hardware group. A program is reentrant if distinct executions of the program on distinct inputs cannot affect each other, whether run sequentially or concurrently. This has the great advantage that reentrant programs can be ported to a multicore machine without the need for concurrency control or for running each program instance on a separate virtual machine.
The authors argue that manually converting legacy single-threaded programs to reentrant programs is a perilous and labor-intensive task because these programs use global state in a non-reentrant way. So to help the programmer transform the code into a reentrant program, the authors proposed and developed an automated Java refactoring tool that replaces global mutable state with thread-local state. This allows each execution thread to get a separate copy of each global variable. Once the tool has introduced the thread-local state, the programmer only needs to manually modify the code to use a new thread for each execution instance of the program to make the program reentrant.

Clearly this is a great tool for Java and for those who use Eclipse, but I think a generic tool performing this task would be great. We do not use Java and Eclipse for our firmware development, but I can see how helpful it would be to have such a tool.

Thursday, October 8, 2009

Software architecture: OO versus functional BA ch 13

In this chapter, the author compares the functional programming style with the object-oriented style. It is probably not a surprise to anyone that he comes to the conclusion that the OO style is the better approach because it provides a higher level of abstraction and is more extensible and reusable.
The OO style is the most widely used programming model. Thinking in terms of objects allows for greater abstraction in the design, and OO techniques like inheritance and polymorphism are powerful for creating extensible and reusable code. This makes an object-oriented environment more manageable, especially as the system grows towards the enterprise level.

The functional model is good when the program does not need to remember state and when only the final result matters. It is great for mathematics, as the program can be written almost exactly as the mathematical expression. I used ML in a compiler class and it was quite impressive to see how one line of ML can compute very complex matrix operations. The corresponding function to achieve the same task in a language like Java would take many lines of code.

I think functional languages are very hard to learn for people who started with the object-oriented programming style. The OO style is very similar to the way people think, and it is very intuitive compared to the functional style.

ReLooper: Refactoring for Loop Parallelism

Like the previous paper on Concurrencer, this paper describes a refactoring technique aimed at converting sequential code to parallel programs. The target for ReLooper is transforming an array into a ParallelArray. A ParallelArray is a special array in Java that supports parallel operations. The authors make the point that many programmers prefer to use refactoring to incrementally convert their code from sequential to parallel programs. Compared to a complete redesign for parallelism, the refactoring approach is considered safer and makes it possible to maintain a working version of the code while the refactorings are being performed.

ReLooper efficiently analyses whether the loops can be parallelized and replaces them with equivalent parallel operations. I think the tool is helpful, but it would be good to see how all these tools designed to help programmers parallelize their programs work together. It seems like there are many tools out there, each targeting a particular type of refactoring for parallelism. I am not sure if enough tests have been done to prove the correctness of these tools. I do like the fact that ReLooper does a static analysis on the code before it performs the transformations and warns the user about conflicting memory accesses and I/O operations.
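The kind of rewrite involved can be shown with a small before/after pair. This is a Python analogy rather than ReLooper's Java output: concurrent.futures stands in for ParallelArray, and normalize and readings are made-up names. The loop qualifies because each iteration reads and writes only its own element and performs no I/O.

```python
from concurrent.futures import ThreadPoolExecutor

def normalize(value):
    """Loop body with no shared writes and no I/O: safe to run per element."""
    return (value - 32) * 5 / 9

readings = [32, 50, 212]

# Sequential version: an explicit loop over the array.
sequential = []
for reading in readings:
    sequential.append(normalize(reading))

# Parallel version: the same body applied as one bulk operation.
with ThreadPoolExecutor() as pool:
    parallel = list(pool.map(normalize, readings))
```

Both versions produce the same list in the same order. Note that in standard CPython, threads illustrate the structure of the change but will not speed up CPU-bound work like this.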

Tuesday, October 6, 2009

When the bazaar sets out to build cathedrals BA ch 12

The author mentioned some key technical decisions, like dropping an idea if it doesn’t work within a reasonable time frame, keeping the core functionality working all the time while changes are being made to the project, etc. What other ideas come to mind while you are reading about these technical decisions, or what other ideas have you seen in cathedral-type project building (in our workplaces)?

In software development there are always difficult decisions that have to be made, with various consequences. As described in this chapter, large-scale software projects are not only technically challenging but also involve other issues. Open source projects need to overcome technical, social and structural issues, while commercial software development teams usually need to deal with aggressive schedules and reduced costs.
The decision to completely redesign KDE from KDE3 to KDE4 was courageous and risky. I believe such a decision was only possible because of the open source status of the project. Given the large number of contributors, they were able to maintain KDE3 and quickly redesign the system. The decision was needed to make KDE platform independent, and it ended up being successful as it made the platform more stable and brought more users.
From my experience it is not easy to make a redesign decision in a commercial environment. I have worked on a couple of projects that clearly needed a complete redesign. In one case, the system was very messy and very hard to maintain. Many developers argued to management that a redesign was necessary and would pay off in the long run, but senior management completely opposed the idea because it was critical to bring new functionality to the market and the redesign was just too much of a risk to take.

After reading the Evolution of Akonadi: the KDEPIM community depended on architecture meetings / conference sessions to discuss the key architecture items, and the author mentioned that most of the high-level / key items were agreed on but the implementation of individual items still faced some heated discussions. Have you had any similar experience where you agreed on a key concept but differed on how it should be implemented, in either a bazaar or a cathedral model project?

This has happened on almost all the projects I have worked on in my current position. We develop firmware for a system where reliability and performance are critical. We regularly have heated discussions in low-level design and code reviews about implementations and code efficiency. Currently we are developing a code of conduct and coding guidelines so that we can refer to them for conflict resolution in the future. I like the section where the authors talk about how some of the decisions are made in KDE. They talk about: “those who do the work decide”. I would argue that some of us working in commercial companies can only dream of such an environment. As the people who do the work, we are usually not even in the room when decisions are made. Senior executives make the decisions, usually based on market conditions, set the date the product needs to be ready and push the decision down to the development teams for implementation.

Monday, October 5, 2009

Refactoring sequential Java code for concurrency via concurrent libraries

I truly enjoyed reading this paper because it is simple, well organized and easy to follow. The authors describe a refactoring tool that enables developers to convert sequential Java code to parallel code using the java.util.concurrent (j.u.c) package. The j.u.c package provides utility classes for concurrent programming. To use the tool, the programmer selects the target piece of code and applies the appropriate refactoring. This allows the program to be incrementally transformed into a parallel program using a series of refactorings under the control of the programmer. While not perfect yet, I think this tool has a great advantage over manual refactoring. The paper presents pretty convincing details of how the tool achieved better results compared to manual refactorings done on the same projects.
The tool is not completely automated, as the programmer needs to select a piece of code and target it with the refactorings. This still leaves room for omission. A fully automated tool would probably find more areas suited for parallelism than a programmer. So, logical extension opportunities for Concurrencer include full automation and support for additional refactorings.

Parallel refactoring tools are great, but for some cases, it is probably more difficult to retrofit concurrency into a sequential program than to re-design for parallelism. If a team has the time and the resources I believe they should redesign. It will be easier and more parallelism will be achieved that way instead of being constrained by a sequential design that was done without taking into account parallelism.

Thursday, October 1, 2009

A Java Fork/Join Framework

This paper describes the design and implementation of a Fork/Join framework that supports a parallel solution to classical divide-and-conquer style problems. Those problems can be divided into subtasks that are solved in parallel, with the subtasks' results combined at the end to produce the final result. I believe this is the most basic and intuitive type of parallel programming. The authors claim that the framework is efficient, supports scalable parallel processing and provides a simple API to the programmer. However, I think the paper gives little detail on the performance analysis, and the scalability test was done only on a single JVM. So it is hard to tell how well this system really performs and how scalable it is. I think this is a weakness of the paper, because such a work should have focused on showing how this framework handles Fork/Join applications more efficiently than traditional thread frameworks, which have more overhead than Fork/Join requires.
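The divide-and-conquer shape that the framework supports can be sketched in a few lines. This is not the Java framework's API and makes no attempt at its scheduling; it is a Python illustration using plain threads, with THRESHOLD, parallel_sum and the depth limit chosen arbitrarily for the example.

```python
import threading

THRESHOLD = 1000  # below this size, solve the subproblem directly

def parallel_sum(data, lo, hi, depth=3):
    """Sum data[lo:hi] by forking the left half and computing the right half."""
    if hi - lo <= THRESHOLD or depth == 0:
        return sum(data[lo:hi])          # base case: sequential solution
    mid = (lo + hi) // 2
    left = {}

    def run_left():
        left["value"] = parallel_sum(data, lo, mid, depth - 1)

    forked = threading.Thread(target=run_left)
    forked.start()                        # fork: left half runs in a new thread
    right = parallel_sum(data, mid, hi, depth - 1)
    forked.join()                         # join: wait for the forked subtask
    return left["value"] + right          # combine the subtask results

numbers = list(range(10_000))
total = parallel_sum(numbers, 0, len(numbers))
```

Creating one thread per fork is exactly the kind of overhead a dedicated Fork/Join framework is meant to avoid, which is why the depth limit is there to cap the number of threads.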

GNU/EMACS: Creeping featurism is a strength BA ch 11

I am a fan of Emacs and I have been using it for years, so a lot of the features described in this chapter were not a surprise to me. Emacs is very powerful. In my team we always joke around: old folks saying “you kids with Emacs” while the younger team members say “you old folks with vi”. I am not sure if other people have seen the same trend.

I will comment on the questions:

* Is it possible for a system like Emacs to be created in a non-open source way?
I think that everything is possible if the money is invested, but it would probably take a huge investment to do it. Emacs has benefited a lot from the contributions of the community. Developing such an extensible and customizable editor is not an easy task.

* What are some of the disadvantages of a system like Emacs?
I don’t see any for people who take the time to learn and discover the features. It can take a while to assimilate all the important features. Like any free software, there is always the risk of not having support.

* What architecture decisions have allowed Emacs to grow like it has?
The decision to make it highly extensible and customizable.

* When is avoiding complexity a good/bad argument for implementation?
It is good to avoid complexity if the added complexity does not add any value necessary for the project. For example if performance is critical to a project, then there will be some complexity added to get the performance out of the system.

* Do you think Firefox will replace Emacs?
I do not see how that is possible. They are 2 different things.
