To download the programs, go to the Download page.
kPhonetica
Speech waveform editing
Spectrogram display
Energy/Pitch/Formant contours
Manual/Automatic labeling
Formant chart, Q-tone level (QTL)
For detailed information, see the Manual page.
kWaves
Speech waveform editing
Spectrogram display
Manual labeling
Automatic aligning
Pitch detection
Speech analysis
Feature extraction
For detailed information, see the Manual page.
ezCSR (Easy Continuous Speech Recognizer)
Functions:
Can recognize isolated words or continuous speech
Can support two kinds of grammar: n-gram, finite state network (FSN)
Can run multiple speech recognizers with different vocabulary and grammar
Can configure speech recognizers flexibly for most purposes
Can handle multiple pronunciations
Support input speech from sound devices or files
Support MFCC, FBANK, LPCC, and PLP features
Support continuous hidden Markov modeling
Support FSN-based search and tree-based search
Support Hangul (Korean language) dictionary
Support decision tree-based and mapping-based state sharing
Support (reduced) file formats compatible with the HTK
Support automatic speech detection
Systems/Compilers Supported:
Windows 2000 Professional/Microsoft Visual C++ 6.0
FreeBSD/GNU C++ v2.95.2 (except audio I/O module)
This software requires the Easy Matrix Template Library (ezMTL).
ezMTL (Easy Matrix Template Library)
C++ template library for matrix and vector algebra
Systems/Compilers:
Windows 32/Visual C++ 6.0
FreeBSD/GNU C++ 2.95.2
Functions:
- Most elementary matrix operations (+, -, *, /, transpose, conjugate, ...)
- Some basic statistical operations (sum, mean, var, std, skewness, kurtosis, ...)
- Checking matrix properties (symmetric, Hermitian, positive definite, diagonal, triangular, zero, identity, ...)
- Cholesky factorization for square real/complex symmetric positive definite matrices
- LU factorization for square real/complex matrices
- QR factorization for general real/complex matrices
- Singular value decomposition for general real matrices
- Hessenberg form for general real/complex matrices
- Schur decomposition for general real/complex matrices
- BKP decomposition for general real matrices
- Eigenvectors and Eigenvalues for symmetric/Hermitian matrices
- Eigenvectors and Eigenvalues for general real/complex matrices
- Inverse, pseudoinverse, determinant, p-norm
- Rank, orthogonal/null space
- Condition numbers
- Solving linear equations
- Matrix exponential, logarithm, square root, power
- Random number generation (uniform, normal, exponential, gamma, beta, t, Fisher, chi-square, binomial, Poisson, ...)
Notes:
This library was designed to make MATLAB code run in C++. Most method names match the corresponding MATLAB commands, except that the first character is capitalized. The library is partly based on the Meschach library.
hantagger
Hangeul part-of-speech (POS) tagging tool
This is a quick-and-dirty program to segment Hangeul sentences into morphemes. It uses the definition of morpheme developed at KAIST, Korea. This program is useful when a simple Korean tagging system is desired. I cannot guarantee the accuracy or correctness of the program. I am releasing the program because it will be useful in some applications.
Targeted applications:
Morpheme-dependent processing of Korean text-to-pronunciation conversion
Simple tagging for speech synthesis systems
Quick-and-dirty implementation of Korean parsers
The following are some tagged examples.
in=강의의 수준 자라고 있다
out=강의/ncn+의/jcm 수준/ncn 자라/pvg+고/ecx 있/px+다/ef
in=이것은 자라고 생각한다 갈 수 없는 나라입니다
out=이것/npd+은/jxt 자라/pvg+고/ecc 생각/ncpa+하/xsv+ㄴ다/ecs 가/pvg+ㄹ/etm 수/nbn 없/paa+는/etm 나라/ncn+이/jp+ㅂ니다/ef
in=강의의 단옷골 갈수각 갈수각은 간다
out=강의/ncn+의/jcm 단옷골/nq 갈수각/nq 갈수각/nq+은/jxt 가/px+ㄴ다/ef
in=권오욱은 고유명사 테스트합니다
out=권오욱/nq+은/jxt 고유/ncn+명사/ncn 테스트/ncpa+하/xsv+ㅂ니다/ef
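The tagged output follows a simple pattern: eojeols are separated by spaces, morphemes within an eojeol are joined by "+", and each morpheme carries its tag after a "/". As a minimal sketch (not part of hantagger; the function name parse_tagged is made up for illustration, and it assumes that morphemes themselves contain no space or "+"), such a line could be read back in Python roughly like this:

```python
def parse_tagged(line):
    """Parse 'm/tag+m/tag m/tag' into a list of eojeols.

    Each eojeol is returned as a list of (morpheme, tag) pairs.
    An optional leading 'out=' prefix is dropped.
    """
    if line.startswith("out="):
        line = line[len("out="):]
    eojeols = []
    for eojeol in line.split():
        pairs = []
        for unit in eojeol.split("+"):
            # Split on the last '/', so the tag is whatever follows it.
            morpheme, _, tag = unit.rpartition("/")
            pairs.append((morpheme, tag))
        eojeols.append(pairs)
    return eojeols


if __name__ == "__main__":
    for eojeol in parse_tagged("out=권오욱/nq+은/jxt 고유/ncn+명사/ncn"):
        print(eojeol)
```

Keeping one list per eojeol preserves the original spacing, which is usually needed by later steps such as pronunciation conversion.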
hanttp
This is a quick-and-dirty program to convert Korean text into pronunciations. The program takes into account the part-of-speech information of eojeols and needs the Korean tagging system included in the current distribution. If conversion speed matters to you, you should revise the program for your own application. I am releasing the program because it will be useful in some applications.
Applications:
Pronunciation generation in variable-vocabulary speech recognizers
Automatic generation of pronunciation dictionaries for large-vocabulary speech recognizers
Text preprocessing for speech synthesis
The following are some examples of the conversion (version 1.0).
in : 의학 의사 회의실 가정의학 감다 갈 것 같은 사람의 앉히다 굳히다 같이 강의의 혜례예혜 밝은 ㄹ것같 측심의 가능성 의의를 둔다
out: 의학 의사 훼이실 가정의학 감따 갈 꺼ㄷ 가튼 사라메 안치다 구치다 가치 강이에 헤례예헤 발근 ㄹ꺼ㄷ까ㄷ 측씨미 가능썽 의이를 둔다
in : 신라의의사신라독립만세국회밝은세상희망회사훼방군
out: 실라이이사실라동님만세구퀘발근세상히망훼사훼방군
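One regularity in Korean pronunciation that such a converter has to handle is liaison: a final consonant moves to the following syllable when that syllable begins with a vowel (밝은 is pronounced 발근, 같은 is pronounced 가튼). The sketch below illustrates only this single rule, using Unicode Hangul syllable arithmetic. It is not hanttp's implementation: it ignores part-of-speech information and every other rule (tensing, nasalization, palatalization, finals involving ㅎ, clusters ending in ㅅ), and the names decompose, compose, and apply_liaison are hypothetical.

```python
BASE = 0xAC00  # code point of the first precomposed Hangul syllable
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
FINALS = " ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ"  # index 0 = no final

# final consonant -> (final left behind, initial carried to the next syllable)
LIAISON = {c: (" ", c) for c in "ㄱㄲㄴㄷㄹㅁㅂㅅㅆㅈㅊㅋㅌㅍ"}
LIAISON.update({
    "ㄵ": ("ㄴ", "ㅈ"), "ㄺ": ("ㄹ", "ㄱ"), "ㄻ": ("ㄹ", "ㅁ"),
    "ㄼ": ("ㄹ", "ㅂ"), "ㄾ": ("ㄹ", "ㅌ"), "ㄿ": ("ㄹ", "ㅍ"),
})


def decompose(ch):
    """Return (initial, medial, final) indices, or None for non-syllables."""
    code = ord(ch) - BASE
    if not 0 <= code < 11172:
        return None
    return code // 588, (code % 588) // 28, code % 28


def compose(initial, medial, final):
    """Build a precomposed syllable from its three indices."""
    return chr(BASE + (initial * 21 + medial) * 28 + final)


def apply_liaison(text):
    """Move a final consonant onto a following vowel-initial syllable."""
    chars = list(text)
    for i in range(len(chars) - 1):
        cur, nxt = decompose(chars[i]), decompose(chars[i + 1])
        if cur is None or nxt is None:
            continue  # spaces, jamo, and other characters block liaison here
        if INITIALS[nxt[0]] != "ㅇ":
            continue  # next syllable does not start with a vowel
        rule = LIAISON.get(FINALS[cur[2]])
        if rule is None:
            continue  # no final, or a final this sketch does not handle
        stay, move = rule
        chars[i] = compose(cur[0], cur[1], FINALS.index(stay))
        chars[i + 1] = compose(INITIALS.index(move), nxt[1], nxt[2])
    return "".join(chars)


if __name__ == "__main__":
    print(apply_liaison("밝은"), apply_liaison("같은"))
```

Working on syllable indices rather than on strings of jamo keeps the rule table small, but a real converter would need many more rules and, as noted above, the part-of-speech tags.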
split2eojeol
This program segments Hangul sentences into eojeols (space-delimited word units).
Usage:
split2eojeol.exe [-sentence "A"] [-compare "A" "B" ...] [-compound "A"]
    [-print_org] [-minWordLen 1] [-maxWordLen 12] [-lm LM_file] [-vocab vocab_file]
    [-beamSize 10] [-penInsertion 0.0] [-splitTh 1] [-dur_wt 1.0] [-keepSpace] < file
To see a detailed message, use the -h option.
Options:
-h|help             help
-sentence "A"       split the sentence A
-compare "A" "B"    compare A and B
-compound "A"       split a compound noun A
-print_org          print original input
Advanced options:
-minWordLen         minimum word length after spacing
-maxWordLen         maximum word length after spacing
-lm                 ARPA language model file
-vocab              non-compound noun file
-beamSize           beam size
-penInsertion       penalty for space insertion
-splitTh            split input longer than or equal to the threshold
-dur_wt             weight for word length score
-keepSpace          keep original spaces
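Options such as -maxWordLen and -penInsertion suggest a search over candidate space positions. The following is a minimal sketch of that general idea, not the program's actual algorithm: it assumes a toy unigram table with made-up log-probabilities instead of an ARPA language model, uses an exhaustive dynamic-programming search instead of a beam search, and assumes the insertion penalty is subtracted once per inserted space. The function segment and all of its parameters are hypothetical.

```python
import math


def segment(text, logprob, max_word_len=12, pen_insertion=0.0, unk_logprob=-20.0):
    """Insert spaces into unspaced text by maximizing a unigram score.

    logprob maps known words to log-probabilities. Unknown chunks are
    scored at unk_logprob per character so that they rarely win.
    """
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best score for text[:i]
    back = [0] * (n + 1)          # back[i]: start of the last word in text[:i]
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            word = text[start:end]
            score = logprob.get(word, unk_logprob * len(word))
            # Every word except the first one costs one inserted space.
            penalty = pen_insertion if start > 0 else 0.0
            candidate = best[start] + score - penalty
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start
    # Follow the back-pointers from the end to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    return " ".join(reversed(words))


if __name__ == "__main__":
    toy = {"어떤": -2.0, "게": -3.0, "맞을까요": -4.0}  # made-up scores
    print(segment("어떤게맞을까요", toy))
```

Raising pen_insertion makes the sketch prefer fewer, longer words, which is presumably the kind of trade-off the real option is meant to tune.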
Examples:
split2eojeol < Column.txt
split2eojeol -sentence "이것을띄어쓰기하면어떻게될까요"
split2eojeol -compare "어떤게맞을까요" "어떤게 맞을까요" "어떤 게 맞을까요"
split2eojeol -compound "기초지방자치단체행정전산화"
TIMIT48
TIMIT is a basic speech database in the speech recognition field. It is often used not only for speaker-independent phoneme recognition but also for speaker recognition, microphone adaptation, and channel adaptation. Experimenting with TIMIT using the HTK involves many steps and parameters that must be set properly. The attached script performs all the required steps and parameter settings for context-independent and context-dependent phoneme recognition with a single command.
To download the above programs, go to the Download page.
© 2009 Oh-Wook Kwon.