For example, the word words would be written wordz. For these spelling changes, the simplest thing to do would be to use regexp string substitution.

But replacing every x with ks would also turn xerox into kseroks; I want it to become xeroks. This means I need to know which x sounds are K S and which are Z.

Substitution also misses monarchs, which should become monarks because the ch in that word is pronounced K. It also turns valet into vale', which may or may not be what I wanted. And it turns illinois (pronounced without the final s) into illinoiz, which is not what the goblins expected. To make these substitutions properly, I need to know the sounds corresponding to the letters.

Last time I trained a neural network to turn spelling (graphemes) into sounds (phonemes). It spelled some things well: kirk → cherk, kurch (instead of chirk, kirch) and other things poorly: clerc, instead of clerk or klurk. However, it didn't tell me which letters corresponded to each sound. I wasn't able to figure out how to override spelling.

I wanted to find something that told me which letters corresponded to each sound. It's not going to be perfect, as English spelling is pretty weird and not necessarily decomposable in this way. I found a paper by Jiampojamarn and Kondrak: Letter-Phoneme Alignment: An Exploration. Letter-phoneme alignment decides which letters correspond to each phoneme. An example in their paper is the word accuse, which maps to sounds AH K Y UW Z.

Compiling m2m-aligner gave me:

    g++ -O3 -ffast-math -funroll-all-loops -fpeel-loops -ftracer -funswitch-loops -funit-at-a-time -pthread -c -I./tclap-1.2.1/include/ mmAligner.cpp -o mmAligner.o
    clang: warning: optimization flag '-funroll-all-loops' is not supported
    g++ -O3 -ffast-math -funroll-all-loops -fpeel-loops -ftracer -funswitch-loops -funit-at-a-time -pthread -c -I./tclap-1.2.1/include/ mmEM.cpp -o mmEM.o
    clang: warning: optimization flag '-funroll-all-loops' is not supported
    clang: warning: optimization flag '-fpeel-loops' is not supported
    clang: warning: optimization flag '-fpeel-loops' is not supported
    clang: warning: optimization flag '-ftracer' is not supported
    clang: warning: optimization flag '-ftracer' is not supported
    clang: warning: optimization flag '-funswitch-loops' is not supported
    clang: warning: optimization flag '-funswitch-loops' is not supported
    g++ -O3 -ffast-math -funroll-all-loops -fpeel-loops -ftracer -funswitch-loops -funit-at-a-time -pthread mmAligner.o mmEM.o -o m2m-aligner -lgcc_s -lpthread -lc -lm
    clang: error: linker command failed with exit code 1 (use -v to see invocation)

It's written for Linux/gcc and I'm trying to compile on Mac/clang. I looked at the makefile and wondered if I really needed the linker flag -lgcc_s.

The convYX.align file has lines like this:

    th  T  ?.12597733532431e-46

They seem to be probabilities for a particular spelling to correspond to a particular set of phonemes. The convYX file (two columns, tab separated) is more useful to me:

    b|a|b|y|l|o|n|        B|AE|B|AH|L|AA|N|
    f|e:r|t|i|l|i|z|e|d|  F|ER|T|AH|L|AY|Z|_|D|

This gives me the correspondence between letters and sounds.
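Here is a minimal sketch of how aligned lines in the convYX format could drive the spelling substitutions instead of a plain regexp. It rests on some assumptions of mine: that `|` separates aligned chunks, that `:` joins several letters or phonemes inside one chunk, and that `_` means "no sound". The function names `parse_alignment` and `respell`, the `RULES` table, and the alignments for words, xerox, monarchs, and illinois are hypothetical, written by hand for illustration rather than taken from the aligner's output.

```python
# Hypothetical substitution rules: (letters, phonemes) -> replacement spelling.
RULES = {
    ("x", ("K", "S")): "ks",   # x pronounced K S becomes ks
    ("ch", ("K",)): "k",       # ch pronounced K becomes k
    ("s", ("Z",)): "z",        # s pronounced Z becomes z
}


def parse_alignment(line):
    """Turn one tab-separated aligned line into (letters, phonemes) pairs.

    Assumed format: chunks separated by '|', multi-symbol chunks joined
    with ':', and '_' standing for a silent chunk.
    """
    letters_col, phonemes_col = line.rstrip("\n").split("\t")
    letter_chunks = [c for c in letters_col.split("|") if c]
    phoneme_chunks = [c for c in phonemes_col.split("|") if c]
    if len(letter_chunks) != len(phoneme_chunks):
        raise ValueError("columns have different numbers of chunks")
    pairs = []
    for letters, phones in zip(letter_chunks, phoneme_chunks):
        spelled = letters.replace(":", "")
        sounds = () if phones == "_" else tuple(phones.split(":"))
        pairs.append((spelled, sounds))
    return pairs


def respell(pairs):
    """Apply RULES chunk by chunk, leaving unmatched chunks unchanged."""
    return "".join(RULES.get(pair, pair[0]) for pair in pairs)


examples = [
    "f|e:r|t|i|l|i|z|e|d|\tF|ER|T|AH|L|AY|Z|_|D|",  # from the convYX file
    "w|o:r|d|s|\tW|ER|D|Z|",                        # hand-written
    "x|e|r|o|x|\tZ|IH|R|AA|K:S|",                   # hand-written
    "m|o|n|a|r|c:h|s|\tM|AA|N|AA|R|K|S|",           # hand-written
    "i|l:l|i|n|o:i|s|\tIH|L|AH|N|OY|_|",            # hand-written
]
for line in examples:
    print(respell(parse_alignment(line)))
```

Because each rule is keyed on both the letters and the sounds, the first x in xerox (pronounced Z) is left alone while the last one (K S) becomes ks, and the silent s in illinois is not turned into z.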