Workflow
Preliminaries
Installing TiLiA
As our main annotation tool, we use the TimeLineAnnotator (a.k.a. TiLiA), a GUI for producing and displaying complex annotations of video and audio files. For each piece in the corpus, a .tla file will be provided, which includes a recording aligned with a score element (MusicXML), a complete beat grid, and an empty hierarchy timeline for creating the form analysis.
The newest version of the TiLiA app can be downloaded here or built from source using PyInstaller. Please consult the installation guide to troubleshoot common installation issues, such as security settings blocking the installation.
Resources for learning TiLiA
Basic information about how to use TiLiA and its different features can be found at:
Miscellaneous Workflow Advice:
- Make sure that the file is saved in a non-temporary folder.
- The saved .tla files cannot be opened by simply clicking on them. This means you always have to open the app first and then the file from the app.
- Save your file regularly to avoid losing progress. You are advised to track changes via git to better remedy file corruption.
- If the app crashes after you have not saved for a long time, check whether a copy has been saved in the ‘Autosaves’ folder (File => Open Autosaves Folder => Copy path; then File => Open => Select path to Autosaves folder and open the last saved version).
- Depending on what you find easier for the music, you may either proceed
  - from larger sections to smaller ones in a top-down way: move the top-level up several times, create child, then split the child using shortcut s, add levels to split children etc.,
  - or identify small formal units and group them upwards by selecting adjacent components (Ctrl-click or drawing a rectangle) and grouping them (shortcut: g).
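The autosave lookup mentioned in the list above can also be done from a script. The following is a minimal sketch, not part of TiLiA: it assumes that autosaved files carry the .tla extension and that you pass in the folder path copied via File => Open Autosaves Folder. The function name latest_autosave is hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_autosave(autosaves_dir: str) -> Optional[Path]:
    """Return the most recently modified .tla file in the folder, or None.

    Assumption: autosaves are stored as .tla files directly in this folder.
    """
    candidates = [p for p in Path(autosaves_dir).glob("*.tla") if p.is_file()]
    if not candidates:
        return None
    # The file with the largest modification time is the last saved version.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Calling latest_autosave with the copied folder path returns the file to open from within the app, or None if the folder contains no .tla files.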
Please refer to the Principles page for more detailed guidance on how to use the app in the context of this project.
Using Git
A good understanding of how git works (creating branches, committing & pushing changes, merge requests & merging) is necessary to collaborate on our decentralised annotation project. There is plenty of learning material available online (e.g. Ry’s Tutorial) and you can use whatever you prefer. For a very brief reference of the commands you need to know, see the Quick Git Reference from the DCML corpus initiative, a predecessor of the current project on Musical Form. This reference contains a small collection of useful config options for your local git installation which can make your life easier. If you don’t know git, it may serve as a primer but does not replace a more thorough introduction to git.
Annotating during the pilot project
- We are not using OpenProject for work package management during the pilot because each annotator has been assigned a predetermined, fixed-size work package.
- We provide prepared TiLiA files for the 48 pieces in the pilot corpus. The person annotating one of them will
  - create a new branch named after the file and their initials, e.g. BWV1007_01_Prelude_jh,
  - commit incremental changes with meaningful commit messages,
  - and create a Pull Request (PR) (including a screenshot) once the analysis is complete.
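The branch naming convention above can be sketched as a small helper. This is only an illustration under stated assumptions: the function names branch_name and git_commands are hypothetical, lowercasing the initials is inferred from the example BWV1007_01_Prelude_jh, and the commands are returned as lists rather than executed, so nothing touches your repository.

```python
from pathlib import Path


def branch_name(tla_file: str, initials: str) -> str:
    """Combine the file name (without extension) with the annotator's initials."""
    # Assumption: initials are lowercased, as in the example BWV1007_01_Prelude_jh.
    return f"{Path(tla_file).stem}_{initials.lower()}"


def git_commands(tla_file: str, initials: str) -> list[list[str]]:
    """Return the git commands for starting a new annotation branch from main."""
    return [
        ["git", "checkout", "main"],  # switch to the main branch
        ["git", "pull"],              # make sure it is up to date
        ["git", "checkout", "-b", branch_name(tla_file, initials)],
    ]


if __name__ == "__main__":
    for command in git_commands("TiLiA/BWV1007_01_Prelude.tla", "JH"):
        print(" ".join(command))
```

The printed commands can be run by hand in a terminal inside your local copy of the repo.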
Before you can start, you need to create a local copy of our annotation_pilot repo by cloning it:
git clone git@github.com:MusicalForm/annotation_pilot.git
Cloning the repo recursively (using the command git clone --recursive git@github.com:MusicalForm/annotation_pilot.git), i.e., including the submodules containing the original four corpora, is not necessary because the TiLiA files, recordings, etc. are included in the pilot repo and the submodules remain untouched.
What’s in this repo?
You’ll find a breakdown of what’s included in the repo in its README.
Where to begin?
- Pick one of the pieces assigned to you and create a new branch reflecting the corresponding file name, followed by your initials, e.g. BWV1007_01_Prelude_jh (to avoid name conflicts for doubly assigned pieces). Be sure that every new branch is created from an up-to-date main branch (in practice, you git checkout main and git pull before creating the new branch).
- Open TiLiA and load the corresponding .tla file from the TiLiA folder. The app will attempt to find the audio file; if it fails, you will see an error message and then be prompted to select the audio file manually. Either way, once you save the file, you should see the respective file path in the .tla file change (the file is in JSON format).
- You should now see the timelines listed under TiLiA Layout. The most important one is the empty hierarchy timeline called Form.
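Since the .tla file is plain JSON, the stored audio path can be checked without opening the app. The sketch below makes no assumption about the key under which TiLiA stores the path; it simply walks the JSON structure and reports every string value that ends in a common audio suffix. The suffix list and the function names are assumptions made for illustration.

```python
import json
from pathlib import Path

# Assumption: recordings use one of these extensions.
AUDIO_SUFFIXES = {".mp3", ".wav", ".ogg", ".flac"}


def find_media_paths(node, key="<root>"):
    """Recursively yield (key, value) pairs whose string value looks like an audio path."""
    if isinstance(node, dict):
        for child_key, value in node.items():
            yield from find_media_paths(value, child_key)
    elif isinstance(node, list):
        for item in node:
            yield from find_media_paths(item, key)
    elif isinstance(node, str) and Path(node).suffix.lower() in AUDIO_SUFFIXES:
        yield key, node


def media_paths_in_tla(tla_file):
    """Load a .tla file as JSON and list all audio-like paths found in it."""
    with open(tla_file, encoding="utf-8") as f:
        return list(find_media_paths(json.load(f)))
```

Running media_paths_in_tla before and after saving the file in TiLiA should show the path change described above.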
Annotating
- Create your form components and cadence markers. Save and commit every now and then.
- If no key segments are available, create those as well (hierarchy timeline with a single level).
- When you end up with two alternative views that you find equally convincing, add a new hierarchy timeline and encode the alternative for the given segment only.
- Whenever useful, add free-text comments to your analytical elements.
- Comments not specifically linked with a hierarchy component, e.g. comments on dramaturgical observations, can be added as markers.
- When problems arise, please let everyone know:
  - For technical issues with our workflow, please open an issue in the annotation_pilot repo.
  - For anything related to the annotation standard, please open a discussion instead.
- When your analysis is done and all commits are pushed, create a Pull Request (PR) in the annotation_pilot repo. Please paste a screenshot of the analysis in the PR description in order to facilitate later discussions. Add a short summary focusing on any problems you may have encountered. In particular, be verbose about any segments where you have encoded two alternatives or where you considered doing so.
For curators
Preparing a new TiLiA file for analysis
In terms of files you will need:
- a MusicXML file of the piece in question
- an unfolded notes TSV file as output by the ms3 parser (ms3 extract -a -N -u)
- a public-domain audio recording or, in its absence, a synthesized rendition of the piece (we avoid YouTube renditions for reasons of data longevity)
In terms of software you’ll need:
- TiLiA, obviously
- ms3_realtime, which includes an installation of the ms3 parser
Aligning a score with an audio recording
Generate an aligned beat grid
The first step is to generate a <filename>.beatgrid.csv file containing a beat grid for the score, mapped to the corresponding timecodes in the recording.
What you need:
- a working Python environment with Python 3.10 or newer (we recommend Miniconda)
- a MuseScore file of the piece in question
- an audio recording, preferably one in the public domain so that it can be shared
  - first stop: IMSLP
  - Creative Commons Search
  - MusOpen (requires log-in)
  - archive.org
  - Musical Heritage Organization
Then you can:
- head to johentsch/ms3_realtime and do the first-time setup
- follow the instructions under Deriving timelines from aligned notes
Import the beatgrid into the TiLiA file
The instructions are in the same ms3_realtime document.
Import the score
- If the score contains annotations, you first need to remove them.
Preparing a repo: Batch alignment
- Download public domain recordings
  Before anything, rename them so that they can easily be matched with the scores but without removing important identifiers. E.g., for the bach_solo repo, the recordings combine the piece IDs with the original IMSLP filename: BWV1007_01_Prelude-IMSLP717130-PMLP164349-Casals.
  If the original files are in WAV format, use that for alignment but add a conversion to MP3 to the repo. E.g.:
for f in *.wav; do ffmpeg -i "$f" -vn -ac 2 -codec:a libmp3lame -q:a 0 "${f%.wav}.mp3"; done
- Make sure that the unfolded notes tables match the recordings. E.g., in the case of bach_solo suites BWV 1007 and 1008, the two minuets came in two separate scores but in a single audio file following the “minuet 1 da capo”. Although in theory it would be conceivable to split the audio recording, the obvious solution here is to create a unified score which encodes the repeat structure of the recording and to create the unfolded notes table from that.
- Create a batch CSV file mapping audio to notes TSV files.
  - Create a CSV file with the columns “audio” and “notes” containing the respective file paths. Paths should use / separators, not \. Paths relative to the repo’s top level are advisable.
  - For controlling the output file names, an optional “name” column can be added.
- Align
  - ms3_realtime approach (preferred)
    - Output CSV files for TiLiA import directly (-tla option).
    - Commit to the respective subfolder under alignments.
  - beat_this approach (requires torch which can take time to set up)
    - Use beat_this to create one .beats file per recording
    - Use the beats2measuregrid.py script to convert the files to TiLiA’s CSV format.
- Create TiLiA files manually.
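The batch CSV from the steps above (columns “audio” and “notes”, optional “name”) could be generated along the following lines. This is a sketch under assumptions, not the project's own tooling: it assumes that the piece ID is the part of the audio file name before the first hyphen (as in BWV1007_01_Prelude-IMSLP717130-PMLP164349-Casals) and the part of the notes file name before the first dot. The function name write_batch_csv and the folder arguments are hypothetical.

```python
import csv
from pathlib import Path


def write_batch_csv(repo_root, audio_dir, notes_dir, out_csv, audio_suffix=".wav"):
    """Write a CSV mapping each audio file to the notes TSV with the same piece ID."""
    root = Path(repo_root)
    # Assumption: the notes file name starts with the piece ID, followed by a dot.
    notes_by_id = {
        p.name.split(".", 1)[0]: p for p in sorted((root / notes_dir).glob("*.tsv"))
    }
    rows = []
    for audio in sorted((root / audio_dir).glob(f"*{audio_suffix}")):
        # Assumption: the audio file name starts with the piece ID, followed by a hyphen.
        piece_id = audio.stem.split("-", 1)[0]
        notes = notes_by_id.get(piece_id)
        if notes is None:
            print(f"No notes table found for {audio.name}, skipping")
            continue
        rows.append({
            # as_posix() guarantees "/" separators; paths are relative to the repo root
            "audio": audio.relative_to(root).as_posix(),
            "notes": notes.relative_to(root).as_posix(),
            "name": piece_id,
        })
    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["audio", "notes", "name"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Recordings without a matching notes table are reported and left out, so the resulting file should be checked against the list of pieces before running the alignment.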
Re-align an existing TiLiA file
This section documents how to re-align the timelines in an existing TiLiA file with a different recording.
Requirements
- A recording featuring the same flow through the score as reflected in the TiLiA file.
- A MuseScore file that unfolds correctly to that same flow.
- The TiLiA file needs to have a beat grid with the correct sequence of measure numbers.
For example, a 16-bar score in which the first 8 bars are repeated goes together with
- a recording that plays the first 8 bars twice, followed by bars 9-16,
- a TiLiA file with a beat grid that has measure numbers 1-8, then again 1-8, then 9-16.
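To make the example concrete, the expected sequence of measure numbers can be written out programmatically and compared by eye with the beat grid. This is a minimal sketch; the function name unfolded_measures and the (first bar, last bar, times played) representation are assumptions chosen for illustration and are unrelated to how ms3 or TiLiA store repeats.

```python
def unfolded_measures(sections):
    """Expand (first_bar, last_bar, times_played) sections into the played sequence."""
    sequence = []
    for first, last, times in sections:
        for _ in range(times):
            sequence.extend(range(first, last + 1))
    return sequence


if __name__ == "__main__":
    # Bars 1-8 are played twice, bars 9-16 once.
    print(unfolded_measures([(1, 8, 2), (9, 16, 1)]))
```

For the 16-bar example this yields 24 measure numbers: 1-8, again 1-8, then 9-16.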
Preliminaries
Clone https://github.com/TimeToAlign/dashboard and, in a new virtual Python environment, execute pip install -r requirements.txt to install the dependencies.
Process
Export the TiLiA file to JSON (File => Export => JSON).
Follow the procedure above to create the .beatgrid.csv file which aligns the score with the new recording.
Activate the correct virtual environment and execute
python processing/notebooks/tilia_warp.py path/to/tilia_export.json path/to/piece.beatgrid.csv
This will output one CSV file per timeline in the TiLiA file with adjusted timecodes.
Create a new TiLiA file, add the local recording file, and import the CSV files one by one.
Exporting a TiLiA file to one CSV file per timeline
Simply use the re-alignment script tilia_warp.py, passing only the exported JSON file as argument:
python processing/notebooks/tilia_warp.py path/to/tilia_export.json
By default, the CSV files will be exported to the same folder as the JSON file.