.\" Hey, EMACS: -*- nroff -*-
.TH RU_TTS_TRANSFER 3 "January 11, 2023"
.SH NAME
ru_tts_transfer \- transfer specified Russian text to speech
.SH SYNOPSIS
.nf
.B typedef int (*ru_tts_callback)(void *buffer, size_t size, void *user_data);
.sp
.BI "void ru_tts_transfer(ru_tts_conf_t *" config ", char *" text \
", void *" wave_buffer ", size_t " wave_buffer_size \
", ru_tts_callback " wave_consumer ", void *" user_data );
.sp
.BI "void ru_tts_config_init(ru_tts_conf_t *" config );
.fi
.SH DESCRIPTION
The
.BR ru_tts_transfer ()
function transfers the text pointed to by the
.I text
argument into digitized sound in the raw linear signed 8-bit 10 kHz
format. The source text should be a zero-terminated
string containing Russian text in the \fBkoi8\-r\fP charset. The symbols
\(oq+\(cq and \(oq=\(cq immediately after a vowel are treated as
strong and weak stress marks respectively. The resulting
data are fed to the callback function referenced by the
.I wave_consumer
argument chunk by chunk via the buffer specified by the
.I wave_buffer
and
.I wave_buffer_size
arguments. The
.I user_data
argument is passed to the callback as a pointer to any additional data.
.PP
Various speech synthesis control options can be passed via the
.I ru_tts_conf_t
data structure pointed to by the
.I config
argument, which contains the following fields:
.PP
.nf
int general_gap_factor;
int semicolon_gap_factor;
int question_gap_factor;
int exclamation_gap_factor;
int intonational_gap_factor;
.fi
.PP
This structure should be initialized by the
.BR ru_tts_config_init ()
function, which fills it with the default values.
.PP
All numeric values represent a percentage of the normal level of the
corresponding parameter. Initially they are set to 100. Each parameter
has its own reasonable value range, but out-of-range values do not
cause any problems since they are treated as the nearest boundary of
the range.
.PP
Speech rate as a percentage of the normal level. Reasonable value range is
.PP
Voice pitch as a percentage of the normal level. Reasonable value range is
.PP
Voice pitch variation range. It can vary from 0 (absolutely monotonic
speech) up to 140 (a bit more expressive than normal).
.PP
Percentage factor applied to all interclause gaps. Its lower boundary
is 0, which means no gaps at all. The maximum depends proportionally on
the speech rate. At the normal rate it is approximately 312.
.PP
Relative duration of the gap implied by a comma. Reasonable
value range is from 0 up to 750.
.PP
Relative duration of the gap implied by a dot. Reasonable
value range is from 0 up to 500.
.TP
.I semicolon_gap_factor
Relative duration of the gap implied by a semicolon.
Reasonable value range is from 0 up to 600.
.PP
Relative duration of the gap implied by a colon. Reasonable
value range is from 0 up to 600.
.TP
.I question_gap_factor
Relative duration of the gap implied by a question mark.
Reasonable value range is from 0 up to 375.
.TP
.I exclamation_gap_factor
Relative duration of the gap implied by an exclamation mark.
Reasonable value range is from 0 up to 300.
.TP
.I intonational_gap_factor
Relative duration of purely intonational gaps that are not caused by
punctuation. Reasonable value range is from 0 up to 1000.
.PP
Additional flags. The following flag constants being
.PP
Treat a point inside a number as a decimal separator. Initially this flag
.PP
Treat a comma inside a number as a decimal separator. Initially this flag
.TP
.B USE_ALTERNATIVE_VOICE
Use the alternative (female) voice instead of the default (male)
one. Initially this flag is not set.
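.PP
The rule that out-of-range values are treated as the nearest boundary
can be pictured with a minimal sketch. The helper
.I clamp_percent
and the constant names are hypothetical and are not part of the library;
only the numeric bounds are taken from the ranges quoted in this page.
The library applies its own limits internally, so a caller is not
expected to do this.
.PP
.nf
```c
/* Hypothetical helper, not a library function: limits value to the
   inclusive range [low, high], i.e. maps it to the nearest boundary. */
static int clamp_percent(int value, int low, int high)
{
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}

/* Upper bounds quoted in this page; the lower bound is 0 for each.
   The constant names are illustrative only. */
enum {
    INTONATION_MAX = 140,
    COMMA_GAP_MAX = 750,
    QUESTION_GAP_MAX = 375
};
```
.fi
.PP
For instance, a requested pitch variation of 200 would behave like 140
under this rule, while a value of 100 stays unchanged.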
.PP
The user-provided callback function is expected to take over further
responsibility for the generated data. It may play the data immediately,
store it somewhere, or do whatever else it is designed for. This function
should return 0 under normal circumstances. A non-zero return value causes
the transfer to stop immediately.
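.PP
As an illustration of the callback contract, the following sketch shows
one possible consumer that stores every chunk in a growable memory
buffer. The names
.IR wave_store ,
.I collect_chunk
and
.I wave_duration_ms
are hypothetical and are not provided by the library; only the callback
signature and the 10 kHz, one byte per sample format come from this page.
.PP
.nf
```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer handed over through user_data. */
struct wave_store {
    unsigned char *data;
    size_t length;
    size_t capacity;
};

/* Matches the ru_tts_callback signature: appends one chunk to the
   store. Returns 0 to continue, non-zero to request an immediate stop. */
static int collect_chunk(void *buffer, size_t size, void *user_data)
{
    struct wave_store *store = user_data;

    if (size == 0)
        return 0;
    if (size > store->capacity - store->length) {
        size_t wanted = store->capacity ? store->capacity : 4096;
        unsigned char *grown;

        while (wanted - store->length < size)
            wanted *= 2;
        grown = realloc(store->data, wanted);
        if (grown == NULL)
            return 1; /* out of memory: stop the transfer */
        store->data = grown;
        store->capacity = wanted;
    }
    memcpy(store->data + store->length, buffer, size);
    store->length += size;
    return 0;
}

/* Duration in milliseconds of n samples at 10 kHz, one byte each. */
static unsigned long wave_duration_ms(size_t n)
{
    return (unsigned long)(n / 10);
}
```
.fi
.PP
With the library available,
.I collect_chunk
and a pointer to a zero-initialized
.I wave_store
would be passed as the
.I wave_consumer
and
.I user_data
arguments of
.BR ru_tts_transfer ().
The caller remains responsible for releasing the collected data with
.BR free (3).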
.SH AUTHOR
Igor B. Poretsky <poretsky@mlbox.ru>.