1. DTD file download
To use the script that converts the Callisto output file to the i2b2 file, you need to annotate records under this DTD profile. The script itself will be released after the final submission.
Once you have downloaded the DTD file, select “tools” in the menu bar and click “DTD Task Compiler” to load it. Then create a new annotation (select “File” in the menu bar and click “New”) by browsing to the original record and selecting the task “i2b2 Annotation for Medication”.
Annotation guide: first highlight the entities of the six categories (medication, dosage, mode, frequency, duration, reason) with their tags, then use the “link tag” to connect each dosage, mode, frequency, duration, or reason to its related medication. Each link forms a binary pair: dosage-medication, mode-medication, frequency-medication, duration-medication, or reason-medication.
Please be careful with punctuation. If the punctuation is inside the entity, you should tag it; otherwise the convertor script will report an error (the extent will not match). For example, in “vasopressin;” the medication should be tagged as a whole.
Also be careful with the last word on a line. The line break character (‘\n’) can get tagged together with that word if you are not careful (this is a bug in Callisto). Please do not tag the line break character; otherwise the convertor script will report an error (the extent will not match).
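The line break problem described above can be caught before conversion with a small check. The following is only a sketch, not part of the released scripts: the function name extent_problems and the use of plain character offsets (start, end) into the raw record are assumptions made for illustration, and it is written for a current Python 3 interpreter rather than the Python 2.5.2 used by the convertor.

```python
def extent_problems(text, start, end):
    """Report line-break problems for the annotated span text[start:end].

    Hypothetical helper: offsets are assumed to be character positions in
    the raw record, with 'end' exclusive.
    """
    span = text[start:end]
    problems = []
    # A span may legitimately cross a line, so only its edges are checked.
    if span[:1] in ("\n", "\r"):
        problems.append("starts with a line break")
    if span[-1:] in ("\n", "\r"):
        problems.append("ends with a line break")
    return problems


record = "started on vasopressin;\nthen weaned"
start = record.index("vasopressin")

# Extent covering "vasopressin;" only.
print(extent_problems(record, start, start + len("vasopressin;")))
# Extent that accidentally swallows the line break after the semicolon.
print(extent_problems(record, start, start + len("vasopressin;\n")))
```

The check looks only at the first and last character of the extent, because the guide warns about a line break being attached to the end of a tagged word, not about entities that continue onto the next line.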
If you have any questions, please send email to
To download the dtd file please click here
2. Callisto Convertor Software is released!
1. Archive Contents:
Readme.txt --------------------- this file
NameEntityTemp ----------------- temporary folder
OutputTemp --------------------- temporary folder
RelationshipTemp --------------- temporary folder
CallistoSourceFile ------------- folder for the Callisto output files
i2b2Output --------------------- folder for the i2b2-formatted results
RawData ------------------------ folder for the original records
BuildGoldAnnotation.py --------- script that processes the Callisto output file
ExtractFromCallisto.py --------- script that provides methods to extract useful data from the Callisto output file
FormatChecker.py --------------- script that checks the annotation format
ListEngine.py ------------------ script that finds the medication heading and its span
RelationExtraction.py ---------- script that extracts the binary relationship pairs sentence by sentence
RelationExtraction_Bigram.py --- script that extracts the binary relationship pairs by sentence bigram
RunConvertor.py ---------------- script that runs this software
SVMReorganizer_v1.py ----------- script that generates the i2b2 format
This convertor software transforms the Callisto annotation output into the i2b2 format.
However, we cannot guarantee that the medication entries are 100% correct. If the manual
annotation is correct, you can probably expect more than 85% of the medication entries to be correct.
You should annotate records in Callisto with the previously released DTD file.
This software was tested in a Windows environment and requires Python 2.5.2 to run. The software has
not been tested with earlier or later versions of Python and is not
guaranteed to run properly on those interpreters.
The annotation should be done according to the annotation guide.
Overlapping annotations are not supported. An entity that is assigned more than one tag
(medication, dosage, mode, frequency, duration, reason) cannot be handled.
However, the link tag can overlap with the other tags (medication, dosage, mode, frequency, duration, reason).
Discontinuous medication mentions are not supported, such as m="pred forte 0.12...drops" for "Pred Forte 0.12 b.i.d. drops".
You need to check these manually after the results are generated.
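Because overlapping entity annotations are not handled, it can help to look for them before running the convertor. The sketch below is an illustration only and is not taken from the released scripts: the function name find_overlaps and the (start, end, tag) tuples with an exclusive end offset are assumptions, and reading these values out of a Callisto output file is left out. Link tags should not be passed in, since they are allowed to overlap the entity tags.

```python
def find_overlaps(entities):
    """Return pairs of entity annotations whose extents overlap.

    entities: list of (start, end, tag) tuples, 'end' exclusive.
    This representation is hypothetical, chosen for the example.
    """
    ordered = sorted(entities)
    overlaps = []
    for i, (s1, e1, t1) in enumerate(ordered):
        for s2, e2, t2 in ordered[i + 1:]:
            if s2 >= e1:
                # Sorted by start offset, so no later span can overlap this one.
                break
            overlaps.append(((s1, e1, t1), (s2, e2, t2)))
    return overlaps


# Entity tags only; the offsets are made up for the example.
annotations = [(0, 10, "m"), (11, 16, "do"), (14, 20, "f")]
for first, second in find_overlaps(annotations):
    print("overlap:", first, second)
```

Two tags placed on exactly the same extent are also reported, which corresponds to the multi-tag case the software cannot handle.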
NameEntityTemp, OutputTemp, and RelationshipTemp are temporary folders.
Do not put important data in these folders, because all data in them is deleted on every run.
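Since the temporary folders are emptied on every run, a quick look at what they currently contain can prevent accidental data loss. This is a minimal sketch under stated assumptions: the helper name leftover_files is hypothetical, the folder names are taken from the archive contents above, and the script is assumed to be started from the directory where the archive was unpacked. It only lists files and deletes nothing.

```python
import os

# Folder names as listed in the archive contents.
TEMP_FOLDERS = ("NameEntityTemp", "OutputTemp", "RelationshipTemp")


def leftover_files(base_dir, folders=TEMP_FOLDERS):
    """List files currently stored in the temporary folders under base_dir."""
    found = []
    for name in folders:
        folder = os.path.join(base_dir, name)
        if not os.path.isdir(folder):
            continue  # folder missing: nothing to report
        for root, _dirs, files in os.walk(folder):
            for filename in sorted(files):
                found.append(os.path.join(root, filename))
    return found


# Assumes the current directory is the unpacked archive.
for path in leftover_files("."):
    print("will be deleted on the next run:", path)
```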
1. Put the Callisto output files in the CallistoSourceFile folder.
2. Put the raw records in the RawData folder (each record should match its Callisto output file).
3. From the command line, run FormatChecker.py to check the annotation format first.
If "Callisto Format is OK!" is displayed, you can go to the next step.
Explanation of the error messages:
"....please check the START position..." means the start point of the link tag does not match the start point of an entity.
"....please check the END position..." means the end point of the link tag does not match the end point of an entity.
"...please check the entity TAG...." means the entity pair connected by the link tag is not a medication pair (m-do, m-mo, m-f, m-du, m-r) but something like do-mo, f-r, etc.
"...Tag error: MULTI-TAGS for one entity...." means more than one tag (medication, dosage, mode, frequency, duration, reason) is assigned to one entity.
4. From the command line, run RunConvertor.py. When it is done, you can find the results in the i2b2Output folder.
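The four error messages in step 3 correspond to four conditions on a link tag, which can be sketched as follows. This is not the code of FormatChecker.py. It is an illustration based on one reading of the messages, namely that a link tag is an extent that starts where one entity starts and ends where another entity ends. The function name check_link, the tuple representation, and the short return codes are all assumptions.

```python
# Tags that may be paired with a medication ("m"): m-do, m-mo, m-f, m-du, m-r.
ALLOWED_PARTNERS = {"do", "mo", "f", "du", "r"}


def check_link(link, entities):
    """Classify a link tag against the entity annotations.

    link: (start, end) extent of the link tag.
    entities: list of (start, end, tag) tuples, 'end' exclusive.
    Both representations are hypothetical.
    """
    start, end = link
    first = [tag for (s, _e, tag) in entities if s == start]
    last = [tag for (_s, e, tag) in entities if e == end]
    if not first:
        return "START"       # no entity begins where the link begins
    if not last:
        return "END"         # no entity ends where the link ends
    if len(first) > 1 or len(last) > 1:
        return "MULTI-TAGS"  # several tags share that boundary
    pair = {first[0], last[0]}
    partners = pair - {"m"}
    if "m" not in pair or len(partners) != 1 or not partners <= ALLOWED_PARTNERS:
        return "TAG"         # not a medication pair, e.g. do-mo or f-r
    return "OK"


# Made-up offsets: a medication, a dosage, and a frequency.
entities = [(10, 21, "m"), (22, 27, "do"), (30, 34, "f")]
print(check_link((10, 27), entities))  # medication linked to dosage
print(check_link((11, 27), entities))  # link starts one character late
print(check_link((22, 34), entities))  # dosage linked to frequency
```

The multi-tag condition is simplified here to "more than one entity shares the link's start or end position"; the released checker may define it differently.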
21/08/2009 --- Released to i2b2 participants
If you have any questions, please send email to
To download the Callisto Convertor Software please click here