Suvichara

Archive for the ‘Modeling-Simulation’ Category

Monte Carlo simulation: Predict how certain is uncertainty..

Posted by Prashant Hegde on August 17, 2007

Uncertainty is a fact of life. It arises for many reasons: incomplete knowledge of reality, complexity, our limited ability to predict future events, unforeseen major events and so on. We still need to plan, execute and compete in the face of uncertainty. How, then, can we state the probability of success, or the best-case, worst-case and average-case estimates for our projects, given the current uncertainties? One option is to calculate the estimates separately for each of these cases. This is not only tedious, it also does not let us ask questions such as: what is the probability that the project will be completed within 2 years, 2.5 years or 3 years? You can think of a project as being able to take many different paths because of its inherent uncertainties. As a result, it can take different amounts of time, cost (effort) etc. to complete. We may want to know the spectrum of these time and cost variations in order to make better decisions. This is where Monte Carlo simulation comes in handy. It can walk through the different paths and generate graphs that show the distributions of time, cost etc. along with their probabilities.

Monte Carlo simulation models can simulate ‘reality’, help make predictions about future outcomes and support better decisions. They help teams cope with uncertainty. Monte Carlo models can walk through thousands of scenarios and generate predictions by taking randomness into account. Compare this with analytical models, whose equations are often difficult to solve and sometimes intractable. Monte Carlo simulation does not require users to be very proficient in mathematics. Many commercial Monte Carlo simulation tools are available.
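
To make the idea concrete, here is a minimal sketch in Python of the kind of question raised above: the probability of finishing within 2, 2.5 or 3 years. The three sequential tasks and their optimistic, most likely and pessimistic durations are made-up numbers for illustration, and the triangular distribution is just one plausible choice of input.

```python
import random

def simulate_project(n_runs=20_000, seed=42):
    """Return simulated total project durations (in years)."""
    rng = random.Random(seed)
    # Hypothetical sequential tasks: (optimistic, most likely, pessimistic) in years.
    tasks = [(0.4, 0.6, 1.0), (0.5, 0.8, 1.5), (0.6, 0.9, 1.4)]
    durations = []
    for _ in range(n_runs):
        # Each run is one possible "path": draw every task duration and add them up.
        # Note the argument order of random.triangular: (low, high, mode).
        total = sum(rng.triangular(low, high, mode) for low, mode, high in tasks)
        durations.append(total)
    return durations

def probability_within(durations, deadline):
    """Fraction of simulated runs that finish on or before the deadline."""
    return sum(d <= deadline for d in durations) / len(durations)

if __name__ == "__main__":
    runs = simulate_project()
    for deadline in (2.0, 2.5, 3.0):
        p = probability_within(runs, deadline)
        print(f"P(done within {deadline} years) = {p:.2f}")
```

A histogram of `runs` would give the kind of distribution graph that the commercial tools draw for you.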

Monte Carlo simulation models require each input to possess a distribution rather than a single number. The normal distribution, the triangular distribution etc. (also called parametric distributions) are quite popular. If you have historical data, the tools will accept it in place of a parametric distribution (such empirical distributions are also called non-parametric). By providing distributions instead of single-point estimates we incorporate randomness into our models. This approach is closer to reality than single-point estimates. How do you interpret some of the properties of distributions? The mean (the first moment) represents the expected value. The variance (the second central moment) represents the associated risk, the third moment describes the distribution’s skewness and the fourth moment measures its peakedness (kurtosis). The accuracy of the model depends on the level of detail to which you have modeled and on the correctness of the data.
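
As a rough illustration of these four properties, the sketch below computes them from a list of samples using the usual population formulas (central moments, with skewness and kurtosis standardized by the variance). The function name `moments` and the sample input are my own choices, not taken from any particular tool.

```python
import random
import statistics

def moments(samples):
    """Return (mean, variance, skewness, excess kurtosis) of the samples."""
    n = len(samples)
    mean = statistics.fmean(samples)
    # Central moments: averages of powers of the deviations from the mean.
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return mean, m2, skewness, excess_kurtosis

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical input: a triangular estimate with a long tail to the right.
    samples = [rng.triangular(0.0, 10.0, 1.0) for _ in range(10_000)]
    mean, variance, skewness, kurtosis = moments(samples)
    print(f"mean={mean:.2f} variance={variance:.2f} "
          f"skewness={skewness:.2f} excess kurtosis={kurtosis:.2f}")
```

A positive skewness here signals that overruns are more likely than early finishes, which a single-point estimate cannot express.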

If your data are correlated, you can include the correlations in the model so that it takes them into account. Make sure you include the assumptions as part of the model. You can carry out a sensitivity analysis of the results with respect to the assumptions. If the results are highly sensitive to one or more assumptions, you need to track those assumptions more closely. You can move them to risks if need be, or spend some time getting more clarity on them.
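
Here is a small sketch of both ideas together, under assumed numbers: two task durations that are normally distributed and correlated, with the correlation coefficient `rho` treated as the assumption whose influence we want to check. The means, standard deviations and the values of `rho` are hypothetical.

```python
import math
import random
import statistics

def simulate_total(rho, n_runs=20_000, seed=7):
    """Simulate the sum of two normally distributed task durations with correlation rho."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_runs):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Mix two independent standard normals so that corr(z1, z2_corr) = rho.
        z2_corr = rho * z1 + math.sqrt(1.0 - rho ** 2) * z2
        task_a = 10.0 + 2.0 * z1       # hypothetical: mean 10 weeks, sd 2
        task_b = 8.0 + 1.5 * z2_corr   # hypothetical: mean 8 weeks, sd 1.5
        totals.append(task_a + task_b)
    return totals

def percentile_90(values):
    """Value below which roughly 90% of the simulated totals fall."""
    return statistics.quantiles(values, n=10)[-1]

if __name__ == "__main__":
    # One-at-a-time sensitivity check: vary only the correlation assumption.
    for rho in (0.0, 0.4, 0.8):
        totals = simulate_total(rho)
        print(f"rho={rho}: sd={statistics.pstdev(totals):.2f} "
              f"P90={percentile_90(totals):.2f}")
```

If the 90th percentile moves a lot as `rho` changes, the correlation assumption is one to track closely.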

Monte Carlo simulations are also used in robust design techniques. There, the simulations are used to carry out sensitivity analysis, assess the robustness of the design etc. This helps make designs robust enough to tolerate variations in the process, materials etc., which in turn helps ensure high-quality products.

Monte Carlo simulation tools are also used heavily in finance, for risk analysis, forecasting, sensitivity analysis etc. They are also used in marketing, sales and other disciplines.

Monte Carlo simulation has a wider variety of applications than the ones listed above. It is easy to use and at the same time highly useful.

As always, comments, different views etc are welcome!

Posted in Modeling-Simulation

Simulation: Keep them simple, but no simpler…

Posted by Prashant Hegde on July 19, 2007

It is surprising that in the software industry a lot of managers believe that the availability of tools can magically solve some of their pressing problems. They do not think about the competency needed to use the tool to solve those problems. They forget that ‘a fool with a tool is still a fool’! This applies to many engineering tools, especially modeling and simulation tools. A lot of organizations think that these tools are a panacea. They invest heavily in these tools only to realize later that they have not made any progress!

One thing to be aware of with simulations is that the results are only as good as the simulation model itself. Another is the correctness of the input data used for the simulation. Put another way: ‘garbage in, garbage out’! The first thing to do while building a simulation model is to analyze the input data. Many times, data needs to be collected from real-world processes. Such data are stochastic and can be described by probability models. Once you have collected the data, you need to come up with probability models that can generate data like it. There are many input analysis tools that derive probability models and their parameters from real-world or experimental data. The trick is to decide when you have enough data to feed to these input analysis tools! Once you have probability models for the inputs, you can verify that they indeed have the same distribution as the real-world data by carrying out a Goodness of Fit (GOF) test. Again, this is tricky. You need enough data for the test to tell you correctly whether they are drawn from the same distribution. Another way to verify this: after feeding in the input data, the simulation result should resemble the behavior of the real-world process. If it does not, you will have to suspect your input data. If no real-world data is available, you can consult experts in the field about the correctness of your input data. Another factor that affects simulation output is the correctness of the model itself.
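
To give a feel for what an input analysis tool does, here is a minimal sketch that fits an exponential model and checks it with a Kolmogorov-Smirnov (KS) test, which is one of several possible GOF tests. It is only an illustration: the exponential model, the 1.36/sqrt(n) large-sample critical value (roughly the 5% level) and the idea of fitting on one part of the data and testing on another are my assumptions, not the method of any specific tool.

```python
import math
import random

def fit_exponential(samples):
    """Maximum-likelihood estimate of the exponential rate: 1 / sample mean."""
    return len(samples) / sum(samples)

def ks_statistic(samples, cdf):
    """Largest gap between the empirical CDF of the samples and the model CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def fits_exponential(fit_data, test_data):
    """Fit on one data set, then apply the KS test to a separate data set."""
    rate = fit_exponential(fit_data)
    d = ks_statistic(test_data, lambda x: 1.0 - math.exp(-rate * x))
    critical = 1.36 / math.sqrt(len(test_data))  # large-sample approximation
    return d <= critical

if __name__ == "__main__":
    rng = random.Random(3)
    # Stand-in for measured data, e.g. times between arrivals.
    observed = [rng.expovariate(2.0) for _ in range(2000)]
    print("fitted rate:", round(fit_exponential(observed[:1000]), 3))
    print("exponential model accepted:", fits_exponential(observed[:1000], observed[1000:]))
```

With only a handful of observations the test will accept almost any model, which is the "enough data" trap mentioned above.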

There are many possible input distributions: uniform, normal, exponential etc. One way to handle this plethora of distributions is to use a general family such as the Gamma distribution. Several other distributions, for example the exponential, Erlang and chi-squared distributions, can be derived from it as special cases.
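
The special-case relationship is easy to check numerically. The sketch below writes out the Gamma density in its shape/scale form and compares it with the exponential density (shape 1) and the Erlang density (integer shape). The function names are mine.

```python
import math

def gamma_pdf(x, shape, scale):
    """Density of the Gamma distribution with the given shape and scale, for x > 0."""
    return x ** (shape - 1) * math.exp(-x / scale) / (math.gamma(shape) * scale ** shape)

def exponential_pdf(x, rate):
    """Density of the exponential distribution with the given rate."""
    return rate * math.exp(-rate * x)

def erlang_pdf(x, k, rate):
    """Density of the Erlang distribution: the sum of k exponential stages."""
    return rate ** k * x ** (k - 1) * math.exp(-rate * x) / math.factorial(k - 1)

if __name__ == "__main__":
    rate = 2.0
    for x in (0.1, 1.0, 2.5):
        # Gamma with shape 1 and scale 1/rate is the exponential distribution.
        print(x, gamma_pdf(x, 1, 1 / rate), exponential_pdf(x, rate))
        # Gamma with integer shape k and scale 1/rate is the Erlang distribution.
        print(x, gamma_pdf(x, 3, 1 / rate), erlang_pdf(x, 3, rate))
```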

It makes sense to keep the simulation model simple to start with. Once the basic model is running and its behavior is acceptable, add more detail step by step so that it comes closer to the real-world process. The trick is to model only the relevant information that affects the simulation output and to omit irrelevant or less important details. The correctness of the simulation can be checked through verification and validation. To adapt the saying attributed to Einstein: keep the model simple, but no simpler. So the philosophy here is: seek simplicity, then distrust it. [Alfred North Whitehead]

Many simulation tools provide a hierarchical modeling capability. This feature allows the modeler to model the system at different levels of abstraction. The models themselves can act as basic building blocks for higher levels of abstraction. This not only makes models simpler to understand, but also makes them scalable and reusable.

Many real-world problems require models built using several different paradigms, e.g. continuous time, discrete time, discrete event etc. One of the few tools that allows ‘embedding’ of different modeling paradigms is Ptolemy, from the University of California, Berkeley. OMNeT++ is a popular open source network simulation tool. Visit here for a list of simulation tools.

Posted in Modeling-Simulation

UML Activity Diagrams 101

Posted by Prashant Hegde on June 23, 2007

One of the unique features of Activity diagrams is that they make it possible to combine non-Object Oriented (OO) modeling with OO modeling. More often than not, people use specialized non-OO libraries in their OO designs. UML Activity diagrams make it possible to model such non-OO library calls within OO designs. Similarly, Activity diagrams can be quite useful to enterprise modelers for Business Process Modeling, or to Systems Engineers for modeling the flow of information, control, energy etc. They can be used, for example, to detail use cases graphically. They can also help project managers and Systems Engineers represent the project or product development process. People who are familiar with Petri nets, Data Flow diagrams or Control Flow diagrams will find themselves at home with Activity diagrams. Another important addition to Activity diagrams is parameter nodes (technically, they are object nodes), which allow the invoker to pass parameters to the Activity and receive output from it. Activities can be invoked from other Activities, which makes them reusable.

The activity model is defined with semantic variations, which means that the runtime behavior might vary from implementation to implementation and the modeler can choose the behavior that is appropriate. The basic constructs of Activity diagrams are Action nodes, Control nodes (Initial nodes, Decision and Merge nodes, Fork and Join nodes, Final nodes) and Object nodes.

  • Action nodes represent behaviors, which can be Activities, State machines or Interaction diagrams. They consume control and data (referred to as tokens) and output control and/or data. An action could be: create an object, set an object or invoke some behavior. Note that an action can be represented as a function in a procedural language like C, as a method in an OO language, or as a class with behavior in an OO language. Actions are the only constructs in UML that can invoke operations on objects. Note that each action node may have more than one input (control and/or data). The action is triggered only when all the inputs are available. When the action ends, data/control is placed on its outgoing edges (to an object node or as control flow).
  • Control nodes are used to route data/control flow through the diagram based on decisions, to provide parallel paths for data/control flow, or to terminate an activity.
    • An activity diagram can contain more than one start node. The activity will start at all of these nodes simultaneously, so it is better to have a single start node to avoid confusion.
    • Decision nodes route flow to different paths in the diagram based on the evaluation of guard conditions placed on each path. Care should be taken that no more than one guard condition evaluates to true at the same time. Another useful thing to remember: guard conditions should not have any side effects. Decision nodes can be chained. Note also that decision nodes can invoke other Activities to decide the routing.
    • Merge nodes are used to merge multiple flows. Just like decision nodes, merge nodes can be chained.
    • Fork nodes split a flow into more than one simultaneous flow. Data/control tokens are simply copied onto each path.
    • Join nodes synchronize multiple flows. Note that if there are multiple control tokens coming into a join node, they are merged into a single token. If there are both control and data tokens, only the data tokens are copied onto the outgoing flow; no control tokens are copied. Also, modelers should take care not to have a flow final node on one of the paths into a join, because if that flow is terminated the join, and hence the next action, would wait indefinitely. Like merge nodes, join nodes can also be chained.
    • Flow final nodes receive control / data tokens and do nothing. Effectively, they end the flow in that path.
    • Final nodes end an activity when any token is received. Use this instead of a flow final node when the entire activity, including other concurrent paths, should terminate.
  • Object nodes act as placeholders for data as it moves along the path. Object nodes can hold a single token or multiple tokens, and can serve as buffers, backups, central data stores etc. An object node can optionally specify the state of the object. The object can also be a group of data. Objects can traverse an edge only when all the input-output pairs are ready. Central buffers can be used when it is necessary to allocate tokens to different competing destinations.
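
The token rules in the list above can be tried out with a toy model. The sketch below is a deliberately simplified illustration and not the full UML semantics: edges are queues of tokens, an action fires only when every input edge holds a token, and an action with several outgoing edges puts a token on each of them (the effect of an action followed by a fork). The order-handling example and all names in it are made up.

```python
from collections import deque

class Action:
    """An action node that fires only when every input edge holds a token."""

    def __init__(self, name, inputs, outputs):
        self.name = name
        self.inputs = inputs      # names of incoming edges
        self.outputs = outputs    # names of outgoing edges

    def ready(self, edges):
        return bool(self.inputs) and all(edges[e] for e in self.inputs)

    def fire(self, edges):
        for e in self.inputs:
            edges[e].popleft()            # consume one token per input edge
        for e in self.outputs:
            edges[e].append(self.name)    # offer one token on every output edge

def run(actions, edges):
    """Fire ready actions until none is ready; return the firing order."""
    order = []
    progressed = True
    while progressed:
        progressed = False
        for action in actions:
            if action.ready(edges):
                action.fire(edges)
                order.append(action.name)
                progressed = True
    return order

def build_order_flow():
    """Hypothetical order process: Ship needs tokens from both Pack and Bill."""
    names = ("start", "to_pack", "to_bill", "packed", "billed", "done")
    edges = {name: deque() for name in names}
    edges["start"].append("initial")  # the token offered by the initial node
    actions = [
        Action("Ship", ["packed", "billed"], ["done"]),
        Action("Pack", ["to_pack"], ["packed"]),
        Action("Bill", ["to_bill"], ["billed"]),
        Action("Receive order", ["start"], ["to_pack", "to_bill"]),
    ]
    return actions, edges

if __name__ == "__main__":
    actions, edges = build_order_flow()
    print(run(actions, edges))
```

Ship is listed first on purpose: it still fires last, because it has to wait for both of its inputs. Removing the Bill action leaves Ship waiting forever, which is the same trap as a flow final node on one path into a join.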

Actions can be connected to other actions directly when they carry control, or through object nodes when they carry data. As mentioned above, actions start as soon as all the inputs are available. However, streaming parameters are an exception to this rule. Streaming parameters can accept inputs and provide outputs while an action is executing (at least one input must arrive for the action to begin). Apart from normal output parameters, an action can also output exceptions, indicating an exceptional condition. When an exception is output, the other outputs are not defined. Note also that exception outputs cannot be streaming. It is also possible to group parameters (into Parameter Sets) such that exactly one set accepts or provides values. Constraints such as pre-conditions and post-conditions can be applied to actions; they can be specific to a particular invocation of a behavior or apply to all invocations of that behavior. The other useful feature of Activity diagrams is the swim-lane notation. Swim lanes group activities. A swim lane could, for example, represent a department and its responsibilities, or different hardware, components, classes, attributes etc.

For more information, interested readers may consult the series of 6 articles by Conrad Bock (“UML 2 Activity and Action Models”).

Posted in Modeling-Simulation, Systems Engineering

Statecharts and tools…

Posted by Prashant Hegde on June 11, 2007

Statechart (also called state machine) diagrams are one of the most powerful modeling constructs available to modelers. They are used to model reactive systems. By reactive we mean the part of the system that responds to external events or operations.

In UML, classes can have statecharts as part of their behavior. Modeling reactive behavior with Statecharts makes designs more manageable, easier to debug, easier to maintain and, of course, more understandable. CASE tools depend heavily on Statecharts for code generation.

Rhapsody is one of the UML modeling tools that provides extensive Statechart support for modeling embedded systems. Rhapsody also provides an OS-independent Object Execution Framework (OXF), implemented in C++, that provides base classes, an event loop for dispatching events, event queues for storing events etc. The modeler can concentrate on modeling the solution and not worry about its implementation. Rhapsody generates all the code for you! There is also another framework called IDF that implements the state machine in C and is suitable for systems where memory and performance are a concern. Rhapsody also provides animation support for debugging state machine logic.
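
To show what an event queue and a dispatch loop amount to, here is a toy sketch in Python. It has nothing to do with Rhapsody’s actual OXF or IDF APIs. It is a flat state machine without the hierarchy and concurrency that real Statecharts offer, and the door example with its state and event names is invented.

```python
from collections import deque

class StateMachine:
    """A flat state machine driven by a queue of events."""

    def __init__(self, initial, transitions):
        self.state = initial
        # {(state, event): (next_state, action or None)}
        self.transitions = transitions
        self.queue = deque()
        self.log = []

    def post(self, event):
        """Store an event; it is handled later by run()."""
        self.queue.append(event)

    def run(self):
        """Dispatch queued events in arrival order; ignore events with no transition."""
        while self.queue:
            event = self.queue.popleft()
            key = (self.state, event)
            if key not in self.transitions:
                continue
            next_state, action = self.transitions[key]
            if action is not None:
                action(self)
            self.state = next_state

def build_door():
    """Hypothetical reactive object: a door that can be opened, closed and locked."""
    return StateMachine("closed", {
        ("closed", "open"): ("opened", lambda m: m.log.append("door opened")),
        ("opened", "close"): ("closed", lambda m: m.log.append("door closed")),
        ("closed", "lock"): ("locked", lambda m: m.log.append("door locked")),
        ("locked", "unlock"): ("closed", None),
    })

if __name__ == "__main__":
    door = build_door()
    for event in ("open", "lock", "close", "lock"):
        door.post(event)
    door.run()
    print(door.state, door.log)
```

The first "lock" arrives while the door is open and is simply dropped, because no transition is defined for it in that state.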

Apart from UML tools, there are other tools that support Statechart modeling. I am listing some of them here. Telelogic’s Statemate is a very popular tool for modeling embedded systems. MathWorks’ Stateflow is another. Ptolemy, from the University of California, Berkeley, is a popular simulation tool that supports Statechart modeling.

Quantum Leaps is a company that provides a framework for implementing Statecharts that is suitable for embedded systems. Boost, an open source collection of C++ libraries, includes a Statechart library.

Posted in Modeling-Simulation

Implementing your own Domain Specific Language (DSL)

Posted by Prashant Hegde on June 7, 2007

DSLs are gaining importance again, thanks to UML. UML did not live up to its promise. The problem with UML is that it is too generic and too vast. The semantics of UML are also not clearly defined. Its vastness deterred people from learning it, and the lack of precise semantics left people frustrated. A lot of time and money has been spent in the industry on learning UML and UML tools, but the usefulness of these tools remains in question for many.

MDA (Model Driven Architecture) promised that people would only need to model and would write no code: the code is generated from the model. People can ‘execute’ models, ‘debug’ models and ‘deploy’ models. Even though the concept looks good, there are a lot of practical limitations in implementing it. There is hardly any tool on the market that fully implements MDA. One of the ways people build executable models today is by ‘writing’ code within models!

Realizing this, Microsoft started promoting DSLs. They are one of the important tools in its Software Factories concept. DSLs are custom languages for a domain. For example, regular expressions, which you might be familiar with, are a DSL for matching patterns within text. Microsoft developed the DSL Tools SDK, which allows users to develop their own DSLs. This SDK is now part of Visual Studio. Apart from Visual Studio, you can use various open source tools for building your own DSL. Vanderbilt University’s Generic Modeling Environment (GME) is a popular meta-modeling tool that can be used for building DSLs. Honeywell’s DoME is another free meta-modeling tool. Eclipse has limited support for building DSLs. MetaEdit+ is a popular commercial DSL tool from MetaCase. You can also build text-based DSLs using Lex, Yacc, ANTLR etc.
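
A text-based DSL does not have to be big. As a rough sketch, here is a tiny made-up language for describing state transitions, one per line in the form `state --event--> next_state`, parsed with a regular expression (itself a DSL) rather than with Lex, Yacc or ANTLR. The syntax and all the names are invented for this example.

```python
import re

# One transition per line, for example: closed --open--> opened
LINE = re.compile(r"^\s*(\w+)\s*--(\w+)-->\s*(\w+)\s*$")

DOOR_DSL = """
# a door that can be opened, closed and locked
closed --open-->   opened
opened --close-->  closed
closed --lock-->   locked
locked --unlock--> closed
"""

def parse_transitions(text):
    """Parse the DSL text into a table {(state, event): next_state}."""
    table = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        match = LINE.match(line)
        if match is None:
            raise SyntaxError(f"line {number}: cannot parse {line!r}")
        source, event, target = match.groups()
        table[(source, event)] = target
    return table

def run(table, state, events):
    """Interpret the model: apply events in order, ignoring undefined ones."""
    for event in events:
        state = table.get((state, event), state)
    return state

if __name__ == "__main__":
    table = parse_transitions(DOOR_DSL)
    print(run(table, "closed", ["open", "lock", "close", "lock"]))
```

Here the parsed table is interpreted directly. A code generator would walk the same table and emit source code instead.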

Your own DSLs are less complex and easier to learn, and better automatic source code generation is possible from them. You can, for example, build your own Language Workbenches (see Martin Fowler’s article on this) using these DSLs…

Posted in Modeling-Simulation