CIMA
CIMA is the CMS Instrument for Masterclass Analysis, a special piece of PHP software used for CMS Masterclasses. It's not part of the CMS e-Lab, but it is web-accessible and included with the website code.
CIMA in Use
During a CMS Masterclass, participants use the iSpy event display to view a series of CMS particle collision events while using CIMA to record and graph their observations. The Masterclass proctors guide participants through identifying the set of leptons (e or μ) visible in the detector and then using this final state to infer the unseen primary state boson that produced it (Z, W, Higgs, etc.). After examining multiple events, participants plot the invariant mass of Z-type events into a histogram; if all goes as intended, they will see a peak at the Z mass. The experience illustrates how fundamental particles are observed and measured using accelerator data.
Masterclass Session and Locations
Masterclass sessions may be hosted from anywhere in the world, and they generally involve multiple groups from multiple locations joined via video link. Participants are typically high school students (or the local equivalent) and their teachers.
On the front page (index.php), participants first choose their Masterclass session from the "Choose your Masterclass" column. For example, CERN-10Mar2017 would refer to a CMS Masterclass hosted from CERN on March 10, 2017. Once a session is selected, users see the "Choose your location" column. Although it probably no longer exists as you read this, CERN-10Mar2017 had groups of participants in Debrecen (Hungary), Lyon (France), Palaiseau (France), São Paulo SPRACE (Brazil) and Zagreb (Croatia). Users select the appropriate option for their location (Debrecen2017, LyonB2017, etc.) from this column.
Data Groups and Events
As of Q1 2017, CIMA and iSpy use 10,000 CMS detector events divided into 100 groups of 100 events each. After choosing their location, users can see which data groups are assigned to them. Selecting a group opens fillOut.php, which is where much of the work is done during the Masterclass.
During the session, participants use the iSpy event display (accessible through a link in the upper-right of the page) to work their way through the 100 events in the data group that they've selected. iSpy maintains the same group/event numbering system as CIMA so that users can select the same data event in both iSpy and the particle selection table of fillOut.php. iSpy shows a transparent model of the CMS detector along with a reconstruction of the tracks observed in the selected event.
The most salient features of the collision shown in the event display are the electron and muon tracks. Users identify which type of lepton appears in a given event according to the tracks' paths and the part of the detector they appear in, and they select it from the "final state" section of the CIMA particle selection panel. Next, users infer what type of primary state boson decayed into this observed final state using various physical principles. For example, charge conservation dictates that charged bosons (W±) can decay only into odd numbers of electrons or muons (or their antiparticles), while neutral bosons (Z, Higgs, etc.) must decay into even numbers, as lepton/antilepton pairs. Users select their choice from the "primary state" section of the CIMA particle selection panel.
Particle Selection
The options given for "primary state" are W+, W-, W, NP, Higgs, and Zoo.
- W+ and W- should be self-explanatory.
- W is selected when a user identifies the primary boson as a W but cannot determine the sign of its charge.
- NP is selected to indicate a neutral particle, typically a Z. This option used to be labeled "Z," but we changed it to NP because the data includes J/Ψ and Υ events; these are neutral particles whose decays look similar to Z decays at this level of observation.
- Higgs is selected if the user believes the event shows the decay of a Higgs boson.
- Zoo is selected when the user cannot confidently identify the source of the decay.
Higgs and Zoo are marked as "special" events on the panel, and selecting them will disable any selection of the final lepton state. I've never been entirely sure why (Joel).
If a Z-type neutral particle is selected, the user will be asked to determine the invariant mass of the decay from the information provided by the iSpy event display and then enter it into the input box in the particle selection panel.
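As a rough illustration of the quantity being entered, the invariant mass of a two-particle final state can be computed from each particle's transverse momentum, pseudorapidity, and azimuthal angle, treating the particles as massless. This Python sketch is not CIMA or iSpy code, and the function name is made up for the example. It checks the formula against the pt/eta/phi and M columns of the diphoton sample row listed under Importing Event Data.

```python
import math

def invariant_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass of two massless particles (same units as pt)."""
    m_squared = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(m_squared)

# pt1, eta1, phi1, pt2, eta2, phi2 from the masterclass_1-2gamma.csv sample row
mass = invariant_mass(77.2006, 0.250438, 0.605505, 60.1382, 0.650821, -1.539)
print(mass)  # compare with the row's M column, 122.7979...
```

The massless approximation is reasonable for photons, electrons, and muons at these energies, which is why the same expression applies to the dilepton events that make up the Z peak.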
Once the user submits an event through the particle selection panel, the event and the user's selections appear in a table below, and the selection panel cycles to the next event. If the user selected NP as the primary state, the entered mass will appear in the table for that event. If the user selected Higgs, the invariant mass of the event is taken from the database and displayed without user input (even if it's not really a Higgs event! Users are allowed to make mistakes). No other event types display a mass in the table, though all events have an associated mass in the database.
Results and Histogram
After users are done analyzing individual events, the header of the event analysis page (fillOut.php) offers links to two tools for interpreting their work: a tabulated results page and a histogram plot.
The results table on results.php shows the results of all data groups that all locations analyzed during the Masterclass so that users can see what their co-participants in different parts of the world found using other data groups. The histogram on hist.php is an interactive feature that lets users manually enter the masses of Z-type primary states that they found in their data group; like the results table, the histogram captures input from all locations, which allows participants to help construct a better graph than their single location could by itself. The end result is a histogram with sufficient data to show a clear peak at the Z mass.
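To illustrate the idea behind the combined histogram, here is a small Python sketch that bins a list of masses and reports the fullest bin. It is not taken from hist.php: the bin width of 2 and the function name are assumptions for the example, and the sample values are a few masses copied from the Events table excerpt shown later on this page.

```python
from collections import Counter

def bin_masses(masses, bin_width=2.0):
    """Count masses per bin; keys are the lower edge of each bin."""
    counts = Counter()
    for m in masses:
        lower_edge = (m // bin_width) * bin_width
        counts[lower_edge] += 1
    return counts

# A handful of masses from the Events excerpt (assumed bin width: 2)
masses = [90.3327, 91.212, 93.785, 86.5181, 81.5894, 10.2441, 9.83702]
counts = bin_masses(masses)
peak_edge, peak_count = max(counts.items(), key=lambda kv: kv[1])
```

With only a few values the "peak" is illustrative at best; the point of pooling input from every location is that the bins near the Z mass fill up much faster than the rest.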
The Source Code
In the repository, the code is found in the cima/ folder of the repo root. On the VMs, its live files are served from the directory /home/quarkcat/sw/www-php/cima/.
NB this last fact! Since CIMA is PHP and not JSP, it is not served from the same Tomcat directory as the rest of the site, which is /var/lib/tomcatX/webapps/elab/ (AKA quarkcat/sw/tomcat/webapps/elab/). This can be confusing since the CIMA home page URL www.i2u2.org/elab/cms/cima/index.php suggests that the directory tomcat/webapps/elab/cms/cima/ ought to exist, and indeed it does, but it's unused and contains no files. It's probably deletable. We should maybe consider that.
CIMA is not deployed
The deployment scripts deploy-from-svn et al. do not affect CIMA files. To place CIMA files into service on either i2u2-prod or i2u2-dev, you must manually copy files into the /home/quarkcat/sw/www-php/cima/ directory of the VM (the same is true of the other www-*/ directories, by the way).
Even though CIMA is not deployed from the repository, you should still commit changes you make to the source code to the repo for version control and group access.
Ideally, the fileset in the repository branch directory 4.0-ND-dev/cima/ would correspond exactly to the i2u2-dev directory i2u2-dev:/home/quarkcat/sw/www-php/cima/, and 4.0-ND-prod/cima/ to i2u2-prod:/home/quarkcat/sw/www-php/cima/, but over time differences have accumulated between the VM files and their respective repo files. This is the state as of Q1 2017, at least.
The Data
Database Tables
CIMA data is stored on i2u2-db in a MySQL database called Masterclass. The vast majority of tables in Masterclass are "Location" tables (see Location tables, below). In March of 2017, for example, there were 481 tables, 11 of which were NOT Location tables. They are:
mysql> SHOW TABLES FROM Masterclass WHERE `Tables_in_Masterclass`
NOT IN (SELECT `name` FROM `Masterclass`.`Tables`);
+------------------------+
| Tables_in_Masterclass  |
+------------------------+
| EventTables            |
| Events                 |
| EventsExt              |
| Events_Backup27Feb2017 |
| Events_New             |
| Events_Old             |
| MclassEvents           |
| TableGroups            |
| Tables                 |
| groupConnect           |
| histograms             |
+------------------------+
The tables EventsExt, Events_Backup27Feb2017, Events_New, and Events_Old are backup or auxiliary tables created during the Feb2017 CIMA upgrade; they may be deleted in the future. That leaves seven tables for you to be familiar with individually.
Events
The most important table is Events, which is the master list of all 10,000 particle events used in CIMA. It has the form
+-------+------+---------+------------+-------------+
| o_no  | g_no | g_index | ev_no      | mass        |
+-------+------+---------+------------+-------------+
|     1 |    1 |       1 |  490868544 |     75.6802 |
|     2 |    1 |       2 |  489963747 |     59.0754 |
|     3 |    1 |       3 |  329045512 |     70.5787 |
|     4 |    1 |       4 |  328573895 |     81.5894 |
|     5 |    1 |       5 |   75779415 |     90.3327 |
...
|   100 |    1 |     100 |  490570312 |     64.9327 |
|   101 |    2 |       1 |   39338918 |     10.2441 |
|   102 |    2 |       2 |  329158332 |     75.3631 |
|   103 |    2 |       3 |   70443694 |      93.785 |
|   104 |    2 |       4 |   77255513 |     78.8601 |
|   105 |    2 |       5 |  328781228 |     81.2225 |
...
|  9995 |  100 |      95 | 1764877904 |     86.5181 |
|  9996 |  100 |      96 |  200025102 |     9.83702 |
|  9997 |  100 |      97 | 1460456769 |      91.212 |
|  9998 |  100 |      98 |  254964165 |     83.4258 |
|  9999 |  100 |      99 | 1765859249 |     69.2061 |
| 10000 |  100 |     100 |   95312939 |     10.8871 |
+-------+------+---------+------------+-------------+
The primary key is o_no, which ranges from 1 to 10000. This index uniquely identifies every event used in CIMA. The data group that a given event has been assigned to is given by g_no, which ranges from 1 to 100. Each data group therefore has 100 events; the index of an event within its group is given by g_index, which ranges from 1 to 100. The ev_no identifier is something used by the CMS experiment, and it isn't used at all in CIMA as far as I can tell. The invariant mass associated with the event is given by mass.
The g_index column was added by Joel in Feb2017. It isn't fully implemented within the code, which often uses ad-hoc formulae to extract the group index from o_no and g_no. Replacing those formulae with g_index is part of the upgrades indicated by the keyword TASMANIA in the code's comments.
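Based on the Events excerpt above, where o_no 101 maps to group 2, index 1 and o_no 10000 maps to group 100, index 100, the relationship between the three indices can be sketched as follows. This is a Python illustration with made-up function names, not the formulae actually used in the CIMA PHP code, and it assumes every group holds exactly 100 consecutively numbered events.

```python
GROUP_SIZE = 100  # events per data group, per the Events table

def split_o_no(o_no):
    """Return (g_no, g_index) for a 1-based overall event number."""
    g_no = (o_no - 1) // GROUP_SIZE + 1
    g_index = (o_no - 1) % GROUP_SIZE + 1
    return g_no, g_index

def join_o_no(g_no, g_index):
    """Inverse of split_o_no: rebuild o_no from group number and index."""
    return (g_no - 1) * GROUP_SIZE + g_index
```

Subtracting 1 before dividing keeps the last event of each group (o_no 100, 200, ...) in its own group rather than spilling into the next one, which is the usual pitfall with 1-based indices.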
MclassEvents
The table MclassEvents is of the form
+-----+------------------------------+--------+
| id  | name                         | active |
+-----+------------------------------+--------+
|   8 | Test2                        |      0 |
|  11 | 31Jan2015                    |      0 |
|  12 | 10Feb2015                    |      0 |
|  14 | 01Jan2015(orientations)      |      0 |
|  15 | 04Mar2015                    |      0 |
|  16 | 09Feb2015                    |      0 |
|  17 | Fermilab-06Mar2015           |      0 |
|  18 | Fermilab-07Mar2015-14CT      |      0 |
...
| 160 | Mayaguez-25Feb2017           |      1 |
| 161 | Orientations2017             |      1 |
| 162 | CERN-04Mar2017               |      1 |
| 163 | CERN-08Mar2017               |      1 |
| 164 | CERN-10Mar2017               |      1 |
| 165 | CERN-14Mar2017               |      1 |
...
This contains the names of Masterclass events. In this context, "event" refers to a Masterclass session, not to an accelerator collision event. I (Joel) presume that the name of every Masterclass session, past and present, is stored here except for a handful near the beginning that seem to have been manually deleted. The id value is used as a cross-reference with other tables; it functions as a primary key. The active value is a boolean that determines whether or not the given session appears in the selection menu on the CIMA front page.
The name value of the MclassEvents table is referenced within the CIMA source code as $_SESSION["Masterclass"].
Tables
Each of the Masterclass sessions identified in MclassEvents includes participants from multiple locations. Each location is given its own table in the database with a name chosen by the administrator. The names of these Location tables are stored in the Tables table, which has the form
+-----+------------------------------+------+
| id  | name                         | hist |
+-----+------------------------------+------+
| 209 | 17July                       |  214 |
| 426 | 20Feb2017-test1              |  431 |
| 427 | 20Feb2017-test1a             |  432 |
| 428 | 20Feb2017-test1b             |  433 |
| 210 | 23July                       |  215 |
| 117 | Aachen                       |  122 |
| 236 | Aachen2016                   |  241 |
| 502 | Aachen2017                   |  507 |
...
| 433 | uprm-tchrs                   |  438 |
| 434 | uprm-tchrs2                  |  439 |
...
| 246 | ZagrebA2016                  |  251 |
| 498 | ZagrebA2017                  |  503 |
| 293 | ZagrebB2016                  |  298 |
| 499 | ZagrebB2017                  |  504 |
| 113 | Zagreb_2                     |  118 |
| 235 | Zilina2016                   |  240 |
|  68 | Zurich                       |   73 |
| 332 | Zurich2016                   |  337 |
| 477 | Zurich2017                   |  482 |
+-----+------------------------------+------+
For example, the Masterclass session hosted from Mayagüez, Puerto Rico in February of 2017 is identified in the MclassEvents table above with the name "Mayaguez-25Feb2017" and the id 160. This session included two groups of people at the University of Puerto Rico at Mayagüez who were assigned the tables uprm-tchrs and uprm-tchrs2 in the Masterclass database. The names of these tables are shown as they appear in the Tables table above.
The id value functions as a primary key for this table. About 38 id values are missing from the sequence, probably because those rows were manually deleted. The hist value likely refers to a table that records each Location's contributions to the Masterclass's histogram.
Location tables
The Location tables whose names are stored in Tables record how users at each location analyze the data they're assigned. For example, the Location table uprm-tchrs used during the Mayaguez-25Feb2017 Masterclass has the form
+------+-------------+
| o_no | checked     |
+------+-------------+
|    1 | mu;W-       |
|    2 | mu;W+       |
|    3 | e;W-        |
|    8 | mu;NP;90.33 |
|    7 | H           |
|    5 | H           |
|   10 | H           |
|    6 | H           |
|  301 | e;W+        |
|  101 | mu;NP;10.29 |
|  901 | mu;NP;93.05 |
|  201 | e;W         |
|  102 | e;W         |
|  902 | mu;W+       |
|  302 | mu;NP;36.08 |
|  903 | mu;W-       |
...
Every time a user presses the "Submit" button on the particle selection panel of fillOut.php, this table is updated to record which particles' checkboxes were selected and what mass was entered (if applicable). The o_no value is the unique event index given in the Events table, while the checked value is a string that encodes what information users submitted through the particle selection panel of fillOut.php.
This example is actually a bad one: at the time this table was created, the group indices for the newly-imported 10,000 data events were not being properly assigned. A typical Location table should contain o_no values from within the single range of 100 events assigned to a given data group. That is, the o_no values recorded here should be within a range like (1-100), (1401-1500), (9801-9900), etc.
The checked value was originally intended to record particle checkboxes only, but in Feb2017 Joel added user-submitted masses to the string as the quickest way to implement that feature. More properly, these tables should instead be created with a separate mass column to store this number. Doing so is part of the upgrades indicated by the keyword TASMANIA in the code's comments.
The name of a given Location table is referenced within the CIMA source code as $_SESSION["database"]. Data within the table is usually (but not always) accessed as the variables
$events["id"] = o_no;
$events["checked"] = checked;
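Going by the sample rows above, the checked string looks like a semicolon-separated list of the final state, the primary state, and (for NP) the entered mass, while special events appear as a single token such as H. The Python sketch below splits such a string under that assumption. It is not the parsing used in the CIMA source, the function and field names are hypothetical, and formats not shown in the sample are not handled.

```python
def parse_checked(checked):
    """Split a Location-table 'checked' string into labeled parts."""
    parts = checked.split(";")
    if len(parts) == 1:
        # Single token, e.g. "H": a special event with no final state
        return {"final": None, "primary": parts[0], "mass": None}
    final, primary = parts[0], parts[1]
    # A third field, when present, is the user-entered mass (NP events)
    mass = float(parts[2]) if len(parts) > 2 else None
    return {"final": final, "primary": primary, "mass": mass}
```

For instance, parse_checked("mu;NP;90.33") separates the muon final state, the NP primary state, and the mass 90.33. Having to unpick the string this way is the practical argument for the separate mass column mentioned above.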
Importing Event Data
CIMA's event data originates with Tom McCauley, the maintainer of iSpy, who selects events from publicly-available CMS data for use with the CMS Masterclasses.
For the Q1 2017 upgrade, Tom provided CSV files of data from these events for import into the CIMA database. For two examples,
masterclass_1-2gamma.csv:
Run,Event,pt1,eta1,phi1,pt2,eta2,phi2,M,Index
199319,641436592,77.2006,0.250438,0.6055050000000001,60.1382,0.650821,-1.5390000000000001,122.79790133899999,97
masterclass_60-4lepton.csv:
Event,Run,E1,px1,py1,pz1,pt1,eta1,phi1,Q1,E2,px2,py2,pz2,pt2,eta2,phi2,Q2,E3,px3,py3,pz3,pt3,eta3,phi3,Q3,E4,px4,py4,pz4,pt4,eta4,phi4,Q4,M,Index
137440354,195099,92.5961775474,8.5353921252,-22.575752798699998,-89.39532085489999,24.1354,-2.02028,-1.20933,-1,59.8124628499,-10.7217014151,41.810988378699996,-41.4053850271,43.1638,-0.8522719999999999,1.82182,1,21.4101492687,6.953341653760001,-20.2443480232,0.460330755882,21.4052,0.021503900000000003,-1.23995,1,11.013022368900002,-7.746923757160001,6.874656868580001,-3.74311724054,10.3574,-0.353958,2.41578,-1,127.047752639,96
The filename contains the data group (1-100) of the enclosed events immediately after the underscore, along with the physical event type (2gamma, 4lepton, etc.). Each physical event type has a different number and structure of CSV columns, but only "Event," "M" or "Mt," and "Index" are relevant to CIMA.
"Event" is the ev_no column value of Masterclass.Events, the unique CMS identifier for the event. CIMA doesn't really use it, but we record it anyway.
"M" is the mass column value of Masterclass.Events, the invariant mass of the decay. Some physical event types have a transverse mass "Mt" instead, which is not the same thing. Nevertheless, we import "Mt" as mass into the Events table so that the results table has a value to display if the user makes a mistake in analyzing the event.
"Index" is the g_index column value of Masterclass.Events, the value from 1 to 100 that identifies this event within its data group.
For the 2017 upgrade, this made for about 400 CSV files of different column formatting. Joel wrote a Bash script to process these into a single CIMA-master.csv file. Once constructed and moved to i2u2-db:/var/lib/mysql-files/, this file can be easily imported into the Masterclass database with the command
mysql> LOAD DATA INFILE '/var/lib/mysql-files/CIMA-master.csv'
-> INTO TABLE Events
-> FIELDS TERMINATED BY ','
-> LINES TERMINATED BY '\n'
-> IGNORE 1 LINES;
This process ended up working well enough that we should stick to it for future dataset upgrades, if possible.
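For readers who want a feel for what that processing involves, here is a Python sketch of the same idea. It is not Joel's Bash script: the function names are hypothetical, the output column order is assumed to match the Events table (o_no, g_no, g_index, ev_no, mass), and o_no is assumed to follow from the group number and index.

```python
import csv
import io
import re

def group_from_filename(filename):
    """Extract the data group number that follows the underscore."""
    match = re.search(r"_(\d+)-", filename)
    if match is None:
        raise ValueError(f"no group number in {filename!r}")
    return int(match.group(1))

def rows_for_master(filename, csv_text):
    """Yield (o_no, g_no, g_index, ev_no, mass) tuples from one CSV file."""
    g_no = group_from_filename(filename)
    for row in csv.DictReader(io.StringIO(csv_text)):
        g_index = int(row["Index"])
        # Some event types carry a transverse mass "Mt" instead of "M"
        mass = float(row["M"] if "M" in row else row["Mt"])
        o_no = (g_no - 1) * 100 + g_index  # assumed numbering
        yield (o_no, g_no, g_index, int(row["Event"]), mass)

# The diphoton sample row quoted earlier in this section
sample = (
    "Run,Event,pt1,eta1,phi1,pt2,eta2,phi2,M,Index\n"
    "199319,641436592,77.2006,0.250438,0.6055050000000001,"
    "60.1382,0.650821,-1.5390000000000001,122.79790133899999,97\n"
)
rows = list(rows_for_master("masterclass_1-2gamma.csv", sample))
```

Reading columns by header name rather than by position is what lets one routine cope with the differing column layouts of the various event types.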
Development History
CIMA was originally written by Stefan Schoppmann while a grad student at RWTH Aachen.
As of 2016, CIMA used data from 3000 CMS events divided into 30 groups. In February 2017, Joel made upgrades to
- Implement a new set of 10000 CMS events divided into 100 groups
- Improve the look of the particle selection panel (table.tpl) of fillOut.php
- Allow the user to manually enter mass for neutral-boson primary states (NP, formerly Z)
To-Do
Joel keyworded code comments about the next round of upgrades as TASMANIA to make them greppable.
- Completely re-write the CSS for the particle selection panel (/templates/table.tpl). Bootstrap doesn't seem to be good for this application. In particular, fix the incomplete vertical divider and allow the "NP" label to be changed to something longer without distorting the table.
- Overall, the CSS just plain needs to be sorted out and whipped into shape. It's a bit of a mess.
- Update the location tables in Tables to have `mass`, `group_index` columns.
- Put in an easier way to clear a Results table (in fillOut.php)
- Ken suggests an easier way to get rid of old MC groups, I think? Ask for clarification.
- Fix the histogram to stop the vertical auto-rescaling
- Fix the histogram so that mass values on the x-axis are centered on the dividers, not the bins
- Consider an option to automatically fill the histogram.
- Idea: students manually fill the dilepton and diphoton events, then auto-fill the rest?
- Idea: students have a histogram of their own data that they fill manually. Then, that data can be automatically combined with the rest of their location group to form one histogram. Then, all locations are automatically combined into one MCEvent histogram the way it is now.
- When the user selects both a final state and primary state, the table.tpl "Submit" button activates only if the primary state is selected last. It should work in either order.
- Both "mu" and "electron" can be selected in the table in fillOut.php. This shouldn't be. Fix in js/fcns.js.
- An updated screencast showing the mass-entry and histogram creation process would be good. The current one links to leptoquark.
- Ken has an idea to make the interface more general. Particle selection has "Tracks" (electron, muon, photon, zoo) and "Number" (of tracks) (1,2,4,2+2 mixed) in one box, "Charge" (+,-,0,unknown) and "Mass" (entry box). This is adaptable to Masterclasses other than Z-mass. See attached scan.
-- Main.JoelG - 2017-01-10