recognition of chords symbols #243

Open
Bacchushlg opened this issue Jan 27, 2019 · 10 comments
Labels
Chord symbols OCR to recognize chord names

Comments

@Bacchushlg
Collaborator

Bacchushlg commented Jan 27, 2019

It would be good to be able to assign text as chord symbols, ideally in the same way as for lyrics: draw a lasso around the chord symbols and define them as "chord symbol".
Recognition clearly differs somewhat from normal text. In particular, the flat sign must be recognized as "b".
Ideally there should be a set of allowed characters:
A...G, H (for german chords)
b, #, m, M, +, -, o, ø (an o with a slash through it, for m7b5), Δ (a triangle, for maj7)
sus, aug, add, maj, min, mi, alt
4, 5, 6, 7, 9, 11, 13

MuseScore allows all these variants of chord definition, so there is no need to interpret the chords.
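
As a rough illustration of how such an allow-list could be applied to OCR output, here is a minimal Python sketch. The token set comes from the list above (with ø and Δ standing in for the slashed circle and the triangle); the regular expression and the function name `looks_like_chord` are assumptions made for this example, not part of Audiveris or MuseScore.

```python
import re

# Tokens taken from the allowed-character list; how they are combined into a
# pattern is an illustrative assumption.
ROOT = r"[A-H]"          # A...G plus H for German chords
ACCIDENTAL = r"[b#]?"    # optional flat or sharp right after the root
TOKENS = [
    "sus", "aug", "add", "maj", "min", "mi", "alt",
    "m", "M", "+", "-", "o", "ø", "Δ", "b", "#",
    "13", "11", "9", "7", "6", "5", "4",
]


def _alternation(tokens):
    # Longest tokens first so that "maj" is tried before "m".
    ordered = sorted(tokens, key=len, reverse=True)
    return "|".join(re.escape(t) for t in ordered)


CHORD_RE = re.compile(rf"{ROOT}{ACCIDENTAL}(?:{_alternation(TOKENS)})*")


def looks_like_chord(text: str) -> bool:
    """Return True if text consists only of a root plus allowed tokens."""
    return CHORD_RE.fullmatch(text) is not None
```

A check like this only rejects text that contains characters or words outside the allowed set. It does not judge whether the chord makes musical sense, which matches the idea of leaving interpretation to MuseScore.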

I am opening a new issue although #44 already asked for this, because #44 has meanwhile become overloaded with other topics.

@webfischi

Additional symbols in use are:
dim, dom, add, and / (commonly used for 6/9 chords). A very rare variant of the latter
writes the two numbers one above the other like a fraction, but without the fraction bar.
[Image: chord symbols]

The circle with the slash inside and the triangle are ø and Δ.
There are also superscript variations, some with numbers only and some with numbers and text.

@Bacchushlg
Collaborator Author

Thank you for this table.
Just a minor correction:

  • the triangle stands for maj7
  • 7+ means maj7, not 7#5

I have built a table of the harmony expressions used in MusicXML and the representations most commonly found in scores:

Akkordsymbole.xlsx

It is certainly not complete, but it should be valid for 95% of the symbols that appear in typical scores.

@hbitteur
Contributor

@Bacchushlg and @webfischi

The main difficulty is to parse the name of a chord, which is very often composed of letters intermixed with other signs (for example, a small circle to indicate a diminished chord).

The BIG challenge is to train Tesseract to recognize these names from the image pixels.
This is a problem of its own, one that would require the help of someone really knowledgeable about Tesseract.

Could you take the action point of finding such a person?

@webfischi

webfischi commented Jan 31, 2019

> Thank you for this table.
> Just a minor correction:
>
> * the triangle stands for maj7
> * 7+ means maj7, not 7#5
>
> I have built a table of the harmony expressions used in MusicXML and the representations most commonly found in scores:
>
> Akkordsymbole.xlsx
>
> It is certainly not complete, but it should be valid for 95% of the symbols that appear in typical scores.

I just found a website that shows all variations: https://www.songtive.com/chords/piano/Cmaj7

According to that site, only the upper-case triangle is maj7; when it is written normally it means just major. Very confusing...
But + for #5 or augmented is correct, at least according to the website mentioned above and to other sources:
http://adamsguitars.com/content/guitarwebapp/index.php?md=4&rn=8&ct=16&fm=0
https://i.stack.imgur.com/Wpg8C.png
https://en.wikipedia.org/wiki/Augmented_seventh_chord
It is not listed for maj7: https://en.wikipedia.org/wiki/Major_seventh_chord
Maybe someone got confused because -7 is the minor version. To stay consistent, + always means augmented and never major.

@hbitteur there is software available to train the OCR: https://github.com/OpaitSoftware/TesseractStudio.Net

@Bacchushlg
Collaborator Author

> According to that only the upper case triangle is maj7 when it's written normal it is only major, very confusing...

I found an interesting hint about this on Wikipedia:
https://en.wikipedia.org/wiki/Chord_names_and_symbols_(popular_music)#Seventh_chords
(see note [a] on the major seventh)

> But + for #5 or augmented is correct, at least according to the previous mentioned website, other websites

It seems that here, too, there are two possibilities: normally a + sign before the 7 means augmented, while 7+ means maj7.
See e.g. here:
https://de.wikipedia.org/wiki/Akkordsymbol#Übersicht (-> maj7)

@Bacchushlg
Collaborator Author

> The BIG challenge is to train Tesseract to recognize these names from the image pixels.
> This is a problem on its own, that would require the help of someone really knowledgeable about Tesseract.

My idea was to use a separate traineddata file just for chord recognition, containing only the characters expected in chords. The "special" characters (triangle, circle, struck-through circle) could be assigned to characters that are otherwise unused in chords and later mapped properly for the final analysis.
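
To make the mapping step concrete, here is a small Python sketch of the post-processing. The placeholder characters chosen here (`^`, `%`, `*`) and the function name `restore_glyphs` are purely hypothetical; any characters that never occur in chord names would do.

```python
# Hypothetical placeholder assignments: characters that never occur in chord
# names stand in for the special glyphs during OCR and are swapped back later.
PLACEHOLDER_TO_GLYPH = {
    "^": "Δ",  # triangle (maj7)
    "%": "ø",  # struck-through circle (half-diminished, m7b5)
    "*": "°",  # plain circle (diminished)
}

_TABLE = str.maketrans(PLACEHOLDER_TO_GLYPH)


def restore_glyphs(ocr_text: str) -> str:
    """Replace each placeholder character in the OCR output with its glyph."""
    return ocr_text.translate(_TABLE)
```

With this table, an OCR result such as `C^7` would become `CΔ7`, while text without placeholders passes through unchanged.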

@hbitteur
Contributor

@Bacchushlg
You are right. The OCR environment for chord names would be a specific one, different from the environment used for the rest of the sheet.
In Tesseract, such an environment is selected via a "language" name, such as "eng", "deu" or "ita", which is passed as a parameter when Tesseract is called on an image buffer.

So we would certainly need to build such a specific "language" from scratch, trained on combinations of all the characters (letters and special signs) that can appear in a chord name, plus perhaps a dictionary (I don't remember whether the dictionary is really mandatory).
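
For illustration only, this is how selecting a language looks on the Tesseract command line, where `-l` picks the `<lang>.traineddata` file. Audiveris calls Tesseract through a Java binding rather than the CLI, so this Python sketch merely shows the idea; the language name `chordnames`, the file names, and the helper `tesseract_command` are hypothetical.

```python
from typing import List


def tesseract_command(image_path: str, lang: str) -> List[str]:
    """Build a Tesseract CLI call that reads image_path using <lang>.traineddata."""
    # "stdout" as the output base makes Tesseract print the recognized text.
    return ["tesseract", image_path, "stdout", "-l", lang]


# Regular text of the sheet versus a hypothetical chord-only language.
text_cmd = tesseract_command("lyrics.png", "eng")
chord_cmd = tesseract_command("chords.png", "chordnames")
```

The resulting lists could be passed to `subprocess.run` once a matching traineddata file exists in the tessdata directory.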

What could we name such a language?
"omr" could be a candidate, although it does not convey its focus on chord names...
Any other suggestion?

@hbitteur
Contributor

@webfischi
Thanks for the link to TesseractStudio.Net.
Unfortunately, it requires Tesseract version 4 and, as of this writing, Audiveris can operate only with version 3.04 because it needs the JavaCPP interface. Once this interface is available for version 4, we'll be able to make the move.

@Bacchushlg
Collaborator Author

> How could we name such language?
> "omr" could be a candidate, although it does not convey its focus on chord names...

Maybe "hmn", for HarMoNy. MuseScore treats chord names as harmony.

@webfischi

> It seems that also here there exist 2 possibilities: normally the + sign before the 7 means augmented, while 7+ means maj7
> See e.g. here:
> https://de.wikipedia.org/wiki/Akkordsymbol#Übersicht (-> maj7)

I think I get it now: C7+5 would be C7#5 (because there are also +9 and +11 variants), and C7+ could be Cmaj7. The problem is that I don't know of any examples, and Sibelius doesn't offer an option for this chord variant.
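
The two readings discussed here can be written down as a small Python sketch. It is only one possible interpretation of the thread, not a settled rule: the function name `normalize_plus` and the flag `trailing_plus_is_maj7` are made up for this example, and a leading plus as in C+7 is deliberately left untouched.

```python
import re


def normalize_plus(chord: str, trailing_plus_is_maj7: bool = True) -> str:
    """Rewrite '+' in a chord name under one of the two conventions."""
    # A plus between two numbers (7+5, 7+9, 7+11) raises the following degree.
    chord = re.sub(r"(?<=\d)\+(?=\d)", "#", chord)
    # A bare "7+" at the end is ambiguous: maj7 in the German reading,
    # augmented seventh (7#5) in the other one.
    if chord.endswith("7+"):
        root = chord[:-2]
        return root + ("maj7" if trailing_plus_is_maj7 else "7#5")
    return chord
```

Keeping the ambiguous case behind a flag makes the choice explicit instead of baking one convention into the recognizer.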

> maybe "hmn" for HarMoNy. MuseScore handles chord names as such.

There is similar training software for older Tesseract versions:
http://vietocr.sourceforge.net/training.html

hbitteur added the "Chord symbols" label (OCR to recognize chord names) on Nov 9, 2021