// lat/determinize-lattice-pruned.h
// Copyright 2009-2012  Microsoft Corporation
//           2012-2013  Johns Hopkins University (Author: Daniel Povey)
//                2014  Guoguo Chen
// See ../../COPYING for clarification regarding multiple authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// THIS CODE IS PROVIDED *AS IS* BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION ANY IMPLIED
// WARRANTIES OR CONDITIONS OF TITLE, FITNESS FOR A PARTICULAR PURPOSE,
// MERCHANTABLITY OR NON-INFRINGEMENT.
// See the Apache 2 License for the specific language governing permissions and
// limitations under the License.

#ifndef KALDI_LAT_DETERMINIZE_LATTICE_PRUNED_H_
#define KALDI_LAT_DETERMINIZE_LATTICE_PRUNED_H_

#include <fst/fst-decl.h>
#include <fst/fstlib.h>

#include <algorithm>
#include <map>
#include <set>
#include <vector>

#include "fstext/lattice-weight.h"
// #include "hmm/transition-model.h"
#include "itf/options-itf.h"
#include "lat/kaldi-lattice.h"

namespace fst {

/// \addtogroup fst_extensions
/// @{

// For an example of usage, see test-determinize-lattice-pruned.cc

/*
DeterminizeLatticePruned implements a special form of determinization with
epsilon removal, optimized for a phase of lattice generation. This algorithm
also prunes at the same time; the combination is more efficient because it
sometimes prevents us from creating a lot of states that would later be
pruned away. This allows us to increase the lattice-beam without the
algorithm blowing up. Also, because the algorithm processes states in order
from those that appear on high-scoring paths down to those that appear on
low-scoring paths, we can easily terminate it after a specified number of
states or arcs.

The input is an FST with weight-type BaseWeightType (usually a pair of
floats with a lexicographical type of order, such as
LatticeWeightTpl<float>). Typically this would be a state-level lattice, with
input symbols equal to words and output symbols equal to p.d.f.'s (so like
the inverse of HCLG). Imagine representing this as an acceptor of type
CompactLatticeWeightTpl<float>, in which the input/output symbols are words,
and the weights contain the original weights together with strings (with zero
or one symbol in them) containing the original output labels (the p.d.f.'s).
We determinize this using acceptor determinization with epsilon removal.
Remember (from lattice-weight.h) that CompactLatticeWeightTpl has a special
kind of semiring where we always take the string corresponding to the best
cost (of type BaseWeightType) and discard the other. This corresponds to
taking the best output-label sequence (of p.d.f.'s) for each input-label
sequence (of words). We couldn't use the Gallic weight for this, or it would
die as soon as it detected that the input FST was non-functional. In our
case, any acyclic FST (and many cyclic ones) can be determinized. We assume
that there is a function Compare(const BaseWeightType &a, const
BaseWeightType &b) that returns (-1, 0, 1) according to whether (a < b, a ==
b, a > b) in the total order on the BaseWeightType. This information should
be the same as NaturalLess would give, but it is more efficient to do it this
way. You can define this for things like TropicalWeight if you need to
instantiate this class for that weight type.
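
As an illustration only (a sketch, not code taken from lattice-weight.h), a
three-way Compare of the kind assumed above could look as follows for a
made-up weight type. ToyPairWeight, its field names, and the convention that
a lower cost ranks higher in the order are all assumptions of this example.

```cpp
// Hypothetical weight type: a pair of costs, ordered lexicographically.
struct ToyPairWeight {
  float cost1;
  float cost2;
};

// Returns -1, 0 or 1 according to whether a < b, a == b or a > b in the
// toy total order. Assumption of this sketch: a lower cost means a
// "greater" (better) weight.
inline int Compare(const ToyPairWeight &a, const ToyPairWeight &b) {
  if (a.cost1 < b.cost1) return 1;
  if (a.cost1 > b.cost1) return -1;
  if (a.cost2 < b.cost2) return 1;
  if (a.cost2 > b.cost2) return -1;
  return 0;
}
```

A single call answers "less, equal or greater", which is the efficiency point
made above: a boolean "less than" predicate may need two calls to establish
equality.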

We implement this determinization in a special way to make it efficient for
the types of FSTs that we will apply it to. One issue is that if we
explicitly represent the strings (in CompactLatticeWeightTpl) as vectors of
type vector<IntType>, the algorithm takes time quadratic in the length of
words (in states), because propagating each arc involves copying a whole
vector (of integers representing p.d.f.'s). Instead we use a hash structure
where each string is a pointer (Entry*), together with a hash from (Entry*,
IntType) to the successor string (and a way to get the latest IntType and
the ancestor Entry*). This is the class LatticeStringRepository.
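
The idea can be sketched as follows. This is a simplified stand-in written
for illustration, not the actual LatticeStringRepository: the class name
ToyStringRepository, its method names, and the use of int for the symbol type
are assumptions of the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

// A string is a pointer to an Entry. An Entry stores only its last symbol
// and a pointer to the Entry for the preceding prefix.
class ToyStringRepository {
 public:
  struct Entry {
    const Entry *parent;  // prefix string; nullptr means the empty string
    int symbol;           // last symbol of this string
  };

  // Returns the string "parent followed by symbol". The same
  // (parent, symbol) pair always yields the same pointer, so equal strings
  // compare equal by address and extending a string never copies a prefix.
  const Entry *Successor(const Entry *parent, int symbol) {
    Key key(parent, symbol);
    auto it = table_.find(key);
    if (it != table_.end()) return it->second.get();
    auto inserted =
        table_.emplace(key, std::unique_ptr<Entry>(new Entry{parent, symbol}));
    return inserted.first->second.get();
  }

  // Expands a string into an explicit vector by walking the parent links.
  static std::vector<int> ToVector(const Entry *entry) {
    std::vector<int> symbols;
    for (; entry != nullptr; entry = entry->parent)
      symbols.push_back(entry->symbol);
    std::reverse(symbols.begin(), symbols.end());
    return symbols;
  }

 private:
  typedef std::pair<const Entry *, int> Key;
  struct KeyHash {
    std::size_t operator()(const Key &key) const {
      return std::hash<const Entry *>()(key.first) * 31u +
             std::hash<int>()(key.second);
    }
  };
  std::unordered_map<Key, std::unique_ptr<Entry>, KeyHash> table_;
};
```

Appending a symbol is then a single hash lookup, independent of the length of
the string, and the explicit vector is only built when it is needed.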

Another issue is that rather than representing a determinized-state as a
collection of (state, weight), we represent it in a couple of reduced forms.
Suppose a determinized-state is a collection of (state, weight) pairs; call
this the "canonical representation". Note: these collections are always
normalized to remove any common weight and string part. Define end-states as
the subset of states that have an arc out of them with a label on it, or are
final. If we represent a determinized-state as the set of just its
(end-state, weight) pairs, this will be a valid and more compact
representation, and will lead to a smaller set of determinized states (like
early minimization). Call this collection of (end-state, weight) pairs the
"minimal representation".

As a mechanism to reduce compute, we can also consider another
representation. In the determinization algorithm, we start off with a set of
(begin-state, weight) pairs (where the "begin-states" are initial or have a
label on the transition into them), and the "canonical representation"
consists of the epsilon-closure of this set (i.e. follow epsilons). Call this
set of (begin-state, weight) pairs, appropriately normalized, the "initial
representation". If two initial representations are the same, the "canonical
representation" and hence the "minimal representation" will be the same. We
can use this to reduce compute. Note that if two initial representations are
different, this does not preclude the other representations from being the
same.
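
To make the normalization step concrete, here is a minimal sketch under
simplifying assumptions: weights are single additive costs (tropical-style),
the string part is ignored, and the names ToyElement and NormalizeSubset are
invented for the example rather than taken from the implementation.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical subset element: (state id, cost).
typedef std::pair<int, double> ToyElement;

// Subtracts the smallest cost from every element, so that the best element
// ends up with cost zero, and returns the amount removed (the "common
// weight"). Assumes a non-empty subset.
inline double NormalizeSubset(std::vector<ToyElement> *subset) {
  double common = subset->front().second;
  for (const ToyElement &e : *subset) common = std::min(common, e.second);
  for (ToyElement &e : *subset) e.second -= common;
  return common;
}
```

Two collections that differ only by a constant offset, for example
{(3, 2.5), (5, 4.0)} and {(3, 10.5), (5, 12.0)}, become identical after this
step, so they can be recognized as the same determinized-state.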
*/

struct DeterminizeLatticePrunedOptions {
  float delta;  // A small offset used to measure equality of weights.
  int max_mem;  // If >0, determinization will fail and return false
                // when the algorithm's (approximate) memory consumption
                // crosses this threshold.
  int max_loop;  // If >0, can be used to detect non-determinizable input
                 // (a case that wouldn't be caught by max_mem).
  int max_states;  // Limit on the number of states in the output FST (total).
  int max_arcs;    // Limit on the number of arcs in the output FST (total,
                   // not per state).
  float retry_cutoff;  // See the "retry-cutoff" help string in Register().
  DeterminizeLatticePrunedOptions()
      : delta(kDelta),
        max_mem(-1),
        max_loop(-1),
        max_states(-1),
        max_arcs(-1),
        retry_cutoff(0.5) {}
  void Register(kaldi::OptionsItf* opts) {
    opts->Register("delta", &delta, "Tolerance used in determinization");
    opts->Register("max-mem", &max_mem,
                   "Maximum approximate memory usage in "
                   "determinization (real usage might be many times this)");
    opts->Register("max-arcs", &max_arcs,
                   "Maximum number of arcs in "
                   "output FST (total, not per state)");
    opts->Register("max-states", &max_states,
                   "Maximum number of states in output "
                   "FST (total)");
    opts->Register(
        "max-loop", &max_loop,
        "Option used to detect a particular "
        "type of determinization failure, typically due to invalid input "
        "(e.g., negative-cost loops)");
    opts->Register(
        "retry-cutoff", &retry_cutoff,
        "Controls pruning un-determinized "
        "lattice and retrying determinization: if effective-beam < "
        "retry-cutoff * beam, we prune the raw lattice and retry. Avoids "
        "ever getting empty output for long segments.");
  }
};
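
/*
The rule in the "retry-cutoff" help string can be written down as a one-line
check. This is only a sketch of that rule: the helper name
ShouldPruneAndRetry and its parameters are hypothetical, and how the
effective beam is obtained is left to the calling code.

```cpp
// effective_beam: the beam the determinization actually achieved.
// beam: the beam that was requested.
// retry_cutoff: fraction of the requested beam below which we retry.
inline bool ShouldPruneAndRetry(double effective_beam, double beam,
                                double retry_cutoff) {
  return effective_beam < retry_cutoff * beam;
}
```

With the default retry_cutoff of 0.5 and a requested beam of 8.0, an
effective beam of 2.0 would trigger a retry, while 6.0 would not.
*/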

struct DeterminizeLatticePhonePrunedOptions {
  // delta: a small offset used to measure equality of weights.
  float delta;
  // max_mem: if > 0, determinization will fail and return false when the
  // algorithm's (approximate) memory consumption crosses this threshold.
  int max_mem;
  // phone_determinize: if true, do a first pass determinization on both phones
  // and words.
  bool phone_determinize;
  // word_determinize: if true, do a second pass determinization on words only.
  bool word_determinize;
  // minimize: if true, push and minimize after determinization.
  bool minimize;
  DeterminizeLatticePhonePrunedOptions()
      : delta(kDelta),
        max_mem(50000000),
        phone_determinize(true),
        word_determinize(true),
        minimize(false) {}
  void Register(kaldi::OptionsItf* opts) {
    opts->Register("delta", &delta, "Tolerance used in determinization");
    opts->Register("max-mem", &max_mem,
                   "Maximum approximate memory usage in "
                   "determinization (real usage might be many times this).");
    opts->Register(
        "phone-determinize", &phone_determinize,
        "If true, do an "
        "initial pass of determinization on both phones and words (see"
        " also --word-determinize)");
    opts->Register("word-determinize", &word_determinize,
                   "If true, do a second "
                   "pass of determinization on words only (see also "
                   "--phone-determinize)");
    opts->Register("minimize", &minimize,
                   "If true, push and minimize after "
                   "determinization.");
  }
};

/**
    This function implements the normal version of DeterminizeLattice, in which
    the output strings are represented using sequences of arcs, where all but
    the first one has an epsilon on the input side. It also prunes using the
    beam in the "prune" parameter. The input FST must be topologically sorted
    for the algorithm to work. For efficiency it is recommended to sort the
    arcs on ilabel as well. Returns true on success, and false if it had to
    terminate the determinization earlier than specified by the "prune" beam,
    that is, if it terminated because of the max_mem, max_loop or max_arcs
    constraints in the options. CAUTION: you may want to use the version
    below, which outputs to CompactLattice.
*/
template <class Weight>
bool DeterminizeLatticePruned(
    const ExpandedFst<ArcTpl<Weight> >& ifst, double prune,
    MutableFst<ArcTpl<Weight> >* ofst,
    DeterminizeLatticePrunedOptions opts = DeterminizeLatticePrunedOptions());

/*  This is a version of DeterminizeLattice with a slightly more "natural"
    output format, where the output sequences are encoded using the
    CompactLatticeArcTpl template (i.e. the sequences of output symbols are
    represented directly as strings). The input FST must be topologically
    sorted for the algorithm to work. For efficiency it is recommended to
    sort the arcs of the input FST on ilabel as well. Returns true on normal
    success, and false if it had to terminate the determinization earlier
    than specified by the "prune" beam, that is, if it terminated because of
    the max_mem, max_loop or max_arcs constraints in the options. CAUTION: if
    Lattice is the input, you need to Invert() before calling this, so words
    are on the input side.
*/
template <class Weight, class IntType>
bool DeterminizeLatticePruned(
    const ExpandedFst<ArcTpl<Weight> >& ifst, double prune,
    MutableFst<ArcTpl<CompactLatticeWeightTpl<Weight, IntType> > >* ofst,
    DeterminizeLatticePrunedOptions opts = DeterminizeLatticePrunedOptions());

// /** This function takes in lattices and inserts phones at phone boundaries.
//     It uses the transition model to work out the transition_id to phone map.
//     The returning value is the starting index of the phone label. Typically
//     we pick (maximum_output_label_index + 1) as this value. The inserted
//     phones are then mapped to (returning_value + original_phone_label) in
//     the new lattice. The returning value will be used by
//     DeterminizeLatticeDeletePhones() where it works out the phones according
//     to this value.
// */
// template<class Weight>
// typename ArcTpl<Weight>::Label DeterminizeLatticeInsertPhones(
//     const kaldi::TransitionModel &trans_model,
//     MutableFst<ArcTpl<Weight> > *fst);
//
// /** This function takes in lattices and deletes "phones" from them. The
//     "phones" here are actually any label that is larger than
//     first_phone_label because when we insert phones into the lattice, we map
//     the original phone label to (first_phone_label + original_phone_label).
//     It is supposed to be used together with DeterminizeLatticeInsertPhones()
// */
// template<class Weight>
// void DeterminizeLatticeDeletePhones(
//     typename ArcTpl<Weight>::Label first_phone_label,
//     MutableFst<ArcTpl<Weight> > *fst);
//
// /** This function is a wrapper of DeterminizeLatticePhonePrunedFirstPass()
//     and DeterminizeLatticePruned(). If --phone-determinize is set to true,
//     it first calls DeterminizeLatticePhonePrunedFirstPass() to do the
//     initial pass of determinization on the phone + word lattices. If
//     --word-determinize is set true, it then does a second pass of
//     determinization on the word lattices by calling
//     DeterminizeLatticePruned(). If both are set to false, then it gives a
//     warning and copies the lattices without determinization.
//
//     Note: the point of doing first a phone-level determinization pass and
//     then a word-level determinization pass is that it allows us to
//     determinize deeper lattices without "failing early" and returning a
//     too-small lattice due to the max-mem constraint. The result should be
//     the same as word-level determinization in general, but for deeper
//     lattices it is a bit faster, despite the fact that we now have two
//     passes of determinization by default.
// */
// template<class Weight, class IntType>
// bool DeterminizeLatticePhonePruned(
//     const kaldi::TransitionModel &trans_model,
//     const ExpandedFst<ArcTpl<Weight> > &ifst,
//     double prune,
//     MutableFst<ArcTpl<CompactLatticeWeightTpl<Weight, IntType> > > *ofst,
//     DeterminizeLatticePhonePrunedOptions opts
//       = DeterminizeLatticePhonePrunedOptions());
//
// /** "Destructive" version of DeterminizeLatticePhonePruned() where the input
//     lattice might be changed.
// */
// template<class Weight, class IntType>
// bool DeterminizeLatticePhonePruned(
//     const kaldi::TransitionModel &trans_model,
//     MutableFst<ArcTpl<Weight> > *ifst,
//     double prune,
//     MutableFst<ArcTpl<CompactLatticeWeightTpl<Weight, IntType> > > *ofst,
//     DeterminizeLatticePhonePrunedOptions opts
//       = DeterminizeLatticePhonePrunedOptions());
//
// /** This function is a wrapper of DeterminizeLatticePhonePruned() that works
//     for Lattice type FSTs. It simplifies the calling process by calling
//     TopSort(), Invert() and ArcSort() for you.
//     Unlike other determinization routines, the function
//     requires "ifst" to have transition-id's on the input side and words on
//     the output side. This function can be used as the top-level interface to
//     all the determinization code.
// */
// bool DeterminizeLatticePhonePrunedWrapper(
//     const kaldi::TransitionModel &trans_model,
//     MutableFst<kaldi::LatticeArc> *ifst,
//     double prune,
//     MutableFst<kaldi::CompactLatticeArc> *ofst,
//     DeterminizeLatticePhonePrunedOptions opts
//       = DeterminizeLatticePhonePrunedOptions());

/// @} end "addtogroup fst_extensions"

}  // end namespace fst

#endif