SimBOINC: A Simulator for Desktop Grids and Volunteer Computing Systems
The SimBOINC project has been put on hold until further notice due to recent major changes and developments in both BOINC and SimGrid. Thank you for your understanding.
SimBOINC is a simulator for heterogeneous and volatile desktop grids and volunteer computing systems. The goal of this project is to provide a simulator with which to test new scheduling strategies in BOINC, and in other desktop grid and volunteer computing systems in general. SimBOINC is based on the SimGrid simulation toolkit for simulating distributed and parallel systems, and uses SimGrid to simulate BOINC (in particular, the client CPU scheduler, and eventually the work fetch policy) by implementing a number of required functionalities.
SimBOINC simulates a client-server platform where multiple clients request work from a central server. In particular, we have implemented a client class that is based on the BOINC client and uses (almost exactly) the client's CPU scheduler source code. The characteristics of the client (for example, speed, project resource shares, and availability), of the workload (for example, the projects, the size of each task, and checkpoint frequency), and of the network connecting the client and server (for example, bandwidth and latency) can all be specified as simulation inputs. With those inputs, the simulator will execute and produce an output file that gives the values for a number of scheduler performance metrics, such as effective resource shares and task deadline misses.
Request the SimBOINC source code by emailing dkondo a_t lri d_o_t fr. In there, you will find the SimGrid header files in the include directory and the static library libsimgrid.a in the lib directory, compiled on Linux 18.104.22.168 i686. You must change the install path in the Makefile (which is currently configured to use the static library) to point to the location of your simboinc directory. Afterwards, typing 'make' should compile everything and give you the executable.
The source has been successfully compiled and run with gcc 4.0.1 on Mac OS X 10.4.7, and gcc 4.0.3 on Debian Linux 4.0.3-1. If you need to compile it on Windows, send me an email at dkondo a_t lri d_o_t fr, and I'll look into it.
SimBOINC expects the following inputs in the form of xml files:
platform file -- this specifies the hosts in the platform and the network connecting the hosts.
host availability trace files -- these are to be specified within the platform file.
workload file -- this specifies the jobs, i.e., projects, to be executed on the clients.
client states file -- this specifies the configuration of the BOINC clients
simulator file -- this specifies the configuration of the specific simulator execution
As an example, see the corresponding files mini_platform.xml, mini_workload.xml, mini_client_states.xml, and mini_sim.xml in the main directory. To run SimBOINC using these files, run:
./simboinc ./mini_platform.xml ./mini_workload.xml ./mini_client_states.xml ./mini_sim.xml
The platform file is where one constructs the computing and network resources on which the BOINC client and server run. In particular, SimBOINC expects a set of cpu resources, and a set of network links that connect those resources. Moreover, SimBOINC expects the server to be named "Server" (as any other host specified in the file will run the BOINC client). For each resource, one can specify a set of attributes. For example, for cpu resources, one can specify the power and the corresponding availability trace files. For network resources, one can specify their bandwidth and latency.
Here is a small example from
<!DOCTYPE platform_description SYSTEM "surfxml.dtd">
<platform_description>
<cpu name="Server" power="100"/>
<cpu name="Host_1" power="100" availability_file="mini_avail.txt" state_file="mini_fail.txt"/>
<cpu name="Host_1-SB-CPU1" power="100" availability_file="mini_avail.txt" state_file="mini_fail.txt"/>
<network_link name="0" bandwidth="100" latency=".001"/>
<network_link name="1" bandwidth="100" latency=".001"/>
<network_link name="loopback" bandwidth="100.00" latency="0.001"/>
<route src="Server" dst="Server"><route_element name="loopback"/></route>
<route src="Host_1" dst="Host_1"><route_element name="loopback"/></route>
<!-- Assumption: the link elements of the next two routes are missing from this listing; links "0" and "1" are assigned here as a guess. -->
<route src="Server" dst="Host_1"><route_element name="0"/></route>
<route src="Host_1" dst="Server"><route_element name="1"/></route>
</platform_description>
We have a server named Server and a client named Host_1. Server and Host_1 are connected by network links with a bandwidth of 100 and a latency of 0.001. Also, the availability of Host_1 is specified by the trace files named in its availability_file and state_file attributes.
To create a multiple-cpu host in the platform file, create a cpu with the base name, and then for each additional cpu, create another with the suffix "-SB-CPUN", where "N" is the cpu number and the first cpu number is 0. In the above example, Host_1 is a dual-cpu host with cpus "Host_1" and "Host_1-SB-CPU1". Data transfers are handled only through the "primary" cpu, which in this case is "Host_1", and so only the network information for the primary cpu is relevant and used in the simulation. While the platform file has an entry for each cpu, only individual hosts are represented in the client states file. For example, for the above platform, the client states file will only have a record for "Host_1", not "Host_1-SB-CPU1".
WE ASSUME THAT IF A CPU FAILS, THEN ALL CPUS IN THE SAME HOST FAIL AT THE SAME TIME. SO THE TRACE FILES FOR THE CPUS MUST BE IDENTICAL.
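The naming convention and the identical-trace assumption can be checked mechanically before a run. The following Python sketch is not part of SimBOINC; the helper names group_cpus_by_host and hosts_with_mismatched_traces are hypothetical, the embedded XML is a trimmed copy of the example above, and the check compares trace file names only, not file contents.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Trimmed, well-formed fragment modelled on the platform example; a real file
# also carries the DOCTYPE line plus the network_link and route entries.
PLATFORM_XML = """
<platform_description>
  <cpu name="Server" power="100"/>
  <cpu name="Host_1" power="100" availability_file="mini_avail.txt" state_file="mini_fail.txt"/>
  <cpu name="Host_1-SB-CPU1" power="100" availability_file="mini_avail.txt" state_file="mini_fail.txt"/>
</platform_description>
"""

# Additional cpus are named "<primary>-SB-CPU<N>".
_SUFFIX = re.compile(r"^(?P<host>.+)-SB-CPU(?P<n>\d+)$")


def group_cpus_by_host(xml_text):
    """Map each primary cpu name to the list of cpu elements on that host."""
    hosts = defaultdict(list)
    for cpu in ET.fromstring(xml_text).iter("cpu"):
        match = _SUFFIX.match(cpu.get("name"))
        primary = match.group("host") if match else cpu.get("name")
        hosts[primary].append(cpu)
    return dict(hosts)


def hosts_with_mismatched_traces(hosts):
    """Return hosts whose cpus do not all name the same trace files."""
    bad = []
    for primary, cpus in hosts.items():
        traces = {(c.get("availability_file"), c.get("state_file")) for c in cpus}
        if len(traces) > 1:
            bad.append(primary)
    return bad


hosts = group_cpus_by_host(PLATFORM_XML)
for primary, cpus in hosts.items():
    print(primary, [c.get("name") for c in cpus])
print("mismatched:", hosts_with_mismatched_traces(hosts))
```

The keys of the resulting dictionary are also the host names one would expect to find in the client states file, since only primary cpus are listed there.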
For more details on constructing a platform file, see here.
The workload file specifies the projects to be executed over the BOINC platform. In particular, it specifies for each project the name, the total number of tasks to execute, the task size in terms of computation, the task size in terms of communication, the checkpoint frequency for each task, and the delay_bound and rsc_fpops_est BOINC task attributes.
Here is a small example from
The client states input file is based on the client state format exported by the BOINC client to store persistent state. For a BOINC developer, the meaning of the fields should be obvious. The idea is that client state files could be collected and assembled to produce a client_states input file to SimBOINC, which would allow the simulation of BOINC clients using realistic settings. WE ASSUME THE HOST_CPID IN THE CLIENT_STATES FILE IS THE HOST NAME FOR THE PRIMARY CPU IN THE PLATFORM FILE.
See mini_client_states.xml as an example.
This simulation input file specifies the type of simulation to be conducted (e.g. BOINC), the maximum time for simulation after which the simulation will be terminated, and the output file name.
See mini_sim.xml as an example.
In SimGrid, the availability of network and cpu resources can be specified through traces. For cpu resources, one specifies a cpu availability file that denotes the availability of the cpu as a percentage over time. Also, for the cpu, one specifies a failure file that indicates when the cpu fails. In SimGrid, a cpu failure causes all processes running on that cpu to terminate.
In BOINC, at least three things can cause an executing task to fail. First, the task could be preempted by the BOINC client because of the client scheduling policy. Second, the task could be preempted by the BOINC client because of user activity according to the user's preferences. Third, the host could fail (for example due to a machine crash or shutdown). In SimBOINC, the failures of a host specified in the cpu trace files represent the failure resulting from the latter two causes. That is, when a cpu fails as specified in the traces, all processes on the cpu will terminate. However, their state is maintained and persists through the failure so that when the host becomes available again, the processes will be restarted in the same state. That is, the tasks that had been executing before the failure are restarted from the last checkpoint after the failure, and the client state data structure is the same as before the failure.
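The restart-from-checkpoint behaviour can be illustrated with a small Python sketch. It is not SimBOINC code: the function completion_time and its parameters are hypothetical, and it assumes that the cpu is fully available whenever the host is up, that checkpointing itself costs nothing, and that the task starts at the beginning of the first up interval.

```python
import math


def completion_time(work, checkpoint_period, up_intervals):
    """Return the simulated time at which the task finishes, or None.

    work              -- cpu seconds the task needs on a fully available cpu
    checkpoint_period -- cpu seconds of progress between checkpoints
    up_intervals      -- sorted (start, end) pairs during which the host is up
    """
    saved = 0  # progress recorded by the last checkpoint
    for start, end in up_intervals:
        remaining = work - saved
        if end - start >= remaining:
            return start + remaining  # finishes before the next failure
        # The host fails at `end`: only whole checkpoint periods survive.
        progress = saved + (end - start)
        saved = math.floor(progress / checkpoint_period) * checkpoint_period
    return None  # the trace ended before the task could finish


intervals = [(0, 50), (60, 100), (110, 300)]
print(completion_time(100, 30, intervals))    # checkpoints every 30 cpu seconds
print(completion_time(100, 1000, intervals))  # effectively no checkpoints
```

With checkpoints every 30 cpu seconds the task loses only the work done since its last checkpoint at each failure; with a period longer than any up interval it restarts from scratch each time and finishes later.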
SimBOINC uses the logging facility called XBT provided by SimGrid, which is similar in spirit to log4j (and in turn, log4cxx, etc.). It allows for runtime configuration of the messages output and their level of detail. However, it does not yet support appenders.
We chose to use XBT instead of BOINC's message logger because XBT is integrated with SimGrid, and as such can show more informative messages by default (such as the name of the process, the simulation time, etc.).
To control the level of output for each logger, use
The simulator output file must be specified in the simulation input file. The simulator then outputs the following metrics to that file in xml:
for each client
  for each project that the client participates in
    total number of tasks completed
    resource share, and effective resource share calculated using the cpu time for each completed task compared to the total
    number and percentage of missed report deadlines for completed tasks
    number and percentage of report deadlines met for completed tasks
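As a rough illustration of how these per-project metrics relate to one another, here is a Python sketch for a single client. It is not taken from SimBOINC: the CompletedTask record and the project_metrics function are hypothetical, and a deadline is counted as missed when the completion time is later than the deadline, which is an assumption about how the simulator compares the two.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CompletedTask:
    project: str
    cpu_time: float      # cpu seconds consumed by the task
    completed_at: float  # simulation time of completion
    deadline: float      # report deadline


def project_metrics(tasks):
    """Compute per-project counts, effective share and deadline statistics."""
    total_cpu = sum(t.cpu_time for t in tasks)
    by_project = defaultdict(list)
    for t in tasks:
        by_project[t.project].append(t)

    metrics = {}
    for project, done in by_project.items():
        missed = sum(1 for t in done if t.completed_at > t.deadline)
        project_cpu = sum(t.cpu_time for t in done)
        metrics[project] = {
            "completed": len(done),
            # cpu time of this project's tasks compared to the total
            "effective_share": project_cpu / total_cpu if total_cpu else 0.0,
            "missed": missed,
            "missed_pct": 100.0 * missed / len(done),
            "met": len(done) - missed,
            "met_pct": 100.0 * (len(done) - missed) / len(done),
        }
    return metrics


tasks = [
    CompletedTask("A", cpu_time=30, completed_at=40, deadline=50),
    CompletedTask("A", cpu_time=30, completed_at=90, deadline=80),
    CompletedTask("B", cpu_time=40, completed_at=100, deadline=200),
]
for project, m in project_metrics(tasks).items():
    print(project, m)
```

Comparing effective_share with the resource share configured for each project shows how closely the scheduler honoured the requested split.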
Also, for each cpu specified in the platform.xml file, the simulator will output a corresponding .trace file, which records information about the execution of tasks on that cpu. In particular, the trace file shows, one per column, the simulation time, the task name, the event (START, COMPLETED, CANCELLED, or FAILED), the cpu name, and the completion time when applicable.
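A trace file with these columns can be summarised with a few lines of Python. The sketch below assumes whitespace-separated columns in the order listed above, which the source does not confirm; the parse_trace function and the sample lines are made up for illustration and are not real simulator output.

```python
from collections import Counter

EVENTS = {"START", "COMPLETED", "CANCELLED", "FAILED"}


def parse_trace(lines):
    """Turn trace lines into dictionaries, skipping lines that do not fit."""
    records = []
    for line in lines:
        fields = line.split()
        if len(fields) < 4 or fields[2] not in EVENTS:
            continue  # blank, malformed, or unknown event
        time, task, event, cpu = fields[:4]
        records.append({
            "time": float(time),
            "task": task,
            "event": event,
            "cpu": cpu,
            # the fifth column is only present when applicable
            "completion_time": float(fields[4]) if len(fields) > 4 else None,
        })
    return records


# Invented sample in the assumed layout.
SAMPLE = """\
0.0 task_0 START Host_1
120.5 task_0 COMPLETED Host_1 120.5
120.5 task_1 START Host_1
300.0 task_1 FAILED Host_1
"""

records = parse_trace(SAMPLE.splitlines())
print(Counter(r["event"] for r in records))
```

If the real files use a different delimiter or carry a header line, only the splitting and filtering at the top of the loop would need to change.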
If you are already familiar with the BOINC client, then you should be able to jump right into the simulation source code without much trouble. The corresponding BOINC client source file in SimBOINC has the suffix "_sim" appended to the original file name.
If you want to understand the additional simulation classes that SimBOINC provides, you might take a look at the following classes:
We chose to implement the BOINC simulator using SimGrid for a number of reasons. First, SimGrid provides a number of abstractions and tools that simplify the process of simulating complex parallel and distributed systems. For example, SimGrid provides abstractions for processes, computing elements, network links, and so on. These abstractions and tools greatly simplified the implementation of the BOINC simulator. Second, we can leverage the proven accuracy of SimGrid's resource models. For example, SimGrid models the allocation of network bandwidth among competing data transfers using a flow-based TCP model for networks that has been shown to be reasonably accurate. Using SimBOINC based on SimGrid, one could easily construct a network and simulate large peer-to-peer file transfers as a novel usage scenario for BOINC. Third, SimGrid was implemented in C, and using it with BOINC's C++ source code is straightforward.
If you want to learn more about SimGrid, take a look here.
When transplanting BOINC code into SimBOINC, we tried to make as few modifications as possible to the original boinc cpu scheduler source code. Nevertheless, some changes had to be made because we are of course running a simulation. In most cases where BOINC code was left out, it was just commented out so that BOINC developers can see what has been removed. Here is a list that summarizes the changes:
gstate is now a pointer to a CLIENT_STATE object. To avoid pointer dereferencing and a change in syntax in the cpu scheduler, gstate is a reference to a CLIENT_STATE object in the PROJECT class used in
Static variables that have file scope (in particular, in cpu_sched.C) have been changed to member variables. This is because we create multiple client objects in simulation, and thus need multiple instances of each such variable.
We don't consider non-cpu-intensive projects.
OS-specific code was removed (e.g. things like process creation, calling execv versus CreateProcess in Windows).
SimBOINC uses the logging facility called XBT provided by SimGrid instead of the BOINC logger (see above for an explanation).
Here is the TO DO list, where items are listed from highest to lowest priority:
The work fetch policy used is out-of-date and needs to be updated
Result Output Uploads
Create a web interface to the simulator so that users can test specific scheduling scenarios.
What version of BOINC is the simulator in sync with?
BOINC 5.5.11
Is the simulator useful for studying and evaluating the client-side CPU scheduler?
The current simulator *can* simulate a single client that downloads workunits from multiple projects and uses its cpu scheduler to decide when to schedule each workunit.
The server in SimBOINC is different from the typical BOINC server in that there is one server for multiple projects, and so requests for work from multiple projects are channeled to a single server. The server consists of a request_handler that basically uses the work_req_seconds and project_id parameters sent in the scheduler_request to determine the amount of work from a specific project to send to a client.
We understand that for testing new work-fetch policies and cpu schedulers, only a single client that downloads work for multiple projects is needed. But we wanted SimBOINC to be a general-purpose volunteer computing simulator that could simulate new uses of BOINC by different kinds of applications. For example, people should be able to use SimBOINC to simulate the scheduling of low-latency jobs or large peer-to-peer file distribution; in both these cases, simulating multiple clients would be essential.
Is a Windows version available?
We're working on it.
Derrick Kondo is the developer of SimBOINC, which is based on the BOINC project and SimGrid.
David Anderson is the leader and developer of the BOINC project.
Arnaud Legrand and Martin Quinson are currently the primary developers of SimGrid.
Generated on Mon Mar 12 16:21:01 2007 for SimBOINC by