forked from NVIDIA/NeMo
refactor tn data folder, and update of measure (NVIDIA#4028)
* refactor tn data folder, and update of measure
  Signed-off-by: Yang Zhang <[email protected]>
* update jenkins
  Signed-off-by: Yang Zhang <[email protected]>
* added whitelist with spaces for asr
  Signed-off-by: Yang Zhang <[email protected]>
Showing 62 changed files with 307 additions and 193 deletions.
19 files renamed without changes.
4 changes: 0 additions & 4 deletions
nemo_text_processing/text_normalization/en/data/magnitudes.tsv
This file was deleted.
2 files renamed without changes.
213 changes: 101 additions & 112 deletions
...xt_normalization/en/data/measurements.tsv → ...xt_normalization/en/data/measure/unit.tsv
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,129 +1,118 @@ | ||
f degree Fahrenheit | ||
°f degree Fahrenheit | ||
℉ degree Fahrenheit | ||
°F degree Fahrenheit | ||
amu atomic mass unit | ||
bar bar | ||
°c degree Celsius | ||
°C degree Celsius | ||
℃ degree Celsius | ||
km kilometer | ||
m meter | ||
cm2 square centimeter | ||
cm² square centimeter | ||
cm3 cubic centimeter | ||
cm³ cubic centimeter | ||
cm centimeter | ||
mm millimeter | ||
ha hectare | ||
mi mile | ||
m² square meter | ||
m2 square meter | ||
km² square kilometer | ||
km2 square kilometer | ||
cwt hundredweight | ||
db decibel | ||
dm3 cubic decimeter | ||
dm³ cubic decimeter | ||
dm decimeter | ||
ds decisecond | ||
°f degree Fahrenheit | ||
°F degree Fahrenheit | ||
℉ degree Fahrenheit | ||
ft foot | ||
% percent | ||
ghz gigahertz | ||
gw gigawatt | ||
gwh gigawatt hour | ||
hz hertz | ||
kw kilowatt | ||
kW kilowatt | ||
hp horsepower | ||
mg milligram | ||
" inch | ||
kbps kilobit per second | ||
kcal kilo calory | ||
kgf kilogram force | ||
kg kilogram | ||
ghz gigahertz | ||
khz kilohertz | ||
mhz megahertz | ||
km2 square kilometer | ||
km² square kilometer | ||
km kilometer | ||
kpa kilopascal | ||
kwh kilowatt hour | ||
kw kilowatt | ||
kW kilowatt | ||
lb pound | ||
lbs pound | ||
v volt | ||
h hour | ||
mc mega coulomb | ||
s second | ||
nm nanometer | ||
rpm revolution per minute | ||
min minute | ||
mA milli ampere | ||
kwh kilo watt hour | ||
m³ cubic meter | ||
m2 square meter | ||
m² square meter | ||
m3 cubic meter | ||
mph mile per hour | ||
mv milli volt | ||
mw megawatt | ||
μm micrometer | ||
" inch | ||
tb terabyte | ||
cc c c | ||
g gram | ||
da dalton | ||
atm atmosphere | ||
ω ohm | ||
db decibel | ||
ps peta second | ||
oz ounce | ||
hl hecto liter | ||
μg microgram | ||
pg petagram | ||
gb gigabyte | ||
MB megabyte | ||
GB gigabyte | ||
TB terabyte | ||
PB petabyte | ||
EB exabyte | ||
ZB zettabyte | ||
YB yottabyte | ||
kb kilobit | ||
ev electron volt | ||
mb megabyte | ||
kb kilobyte | ||
kbps kilobit per second | ||
m³ cubic meter | ||
mbps megabit per second | ||
kl kilo liter | ||
tj tera joule | ||
kv kilo volt | ||
mv mega volt | ||
kn kilonewton | ||
mm megameter | ||
au astronomical unit | ||
yd yard | ||
rad radian | ||
lm lumen | ||
hs hecto second | ||
mol mole | ||
gpa giga pascal | ||
mg milligram | ||
mhz megahertz | ||
mi2 square mile | ||
mi² square mile | ||
mi mile | ||
min minute | ||
ml milliliter | ||
gw gigawatt | ||
ma mega ampere | ||
kt knot | ||
kgf kilogram force | ||
ng nano gram | ||
mm2 square millimeter | ||
mm² square millimeter | ||
mol mole | ||
mpa megapascal | ||
mph mile per hour | ||
ng nanogram | ||
nm nanometer | ||
ns nanosecond | ||
ms mega siemens | ||
bar bar | ||
gl giga liter | ||
μs microsecond | ||
oz ounce | ||
pa pascal | ||
ds deci second | ||
ms milli second | ||
dm deci meter | ||
dm³ cubic deci meter | ||
dm3 cubic deci meter | ||
amu atomic mass unit | ||
mb megabit | ||
mf mega farad | ||
bq becquerel | ||
pb petabit | ||
mm² square millimeter | ||
mm2 square millimeter | ||
cm² square centimeter | ||
cm2 square centimeter | ||
cm³ cubic centimeter | ||
cm3 cubic centimeter | ||
sq mi square mile | ||
mi² square mile | ||
mi2 square mile | ||
% percent | ||
rad radian | ||
rpm revolution per minute | ||
sq ft square foot | ||
kpa kilopascal | ||
cd candela | ||
tl tera liter | ||
ms mega second | ||
mpa megapascal | ||
pb peta byte | ||
gwh giga watt hour | ||
kcal kilo calory | ||
gy gray | ||
sq mi square mile | ||
sv sievert | ||
cwt hundredweight | ||
cc c c | ||
tb terabyte | ||
tj terajoule | ||
tl teraliter | ||
v volt | ||
yd yard | ||
μg microgram | ||
μm micrometer | ||
μs microsecond | ||
ω ohm | ||
atm ATM | ||
au AU | ||
bq BQ | ||
cc CC | ||
cd CD | ||
da DA | ||
eb EB | ||
ev EV | ||
f F | ||
gb GB | ||
g G | ||
gl GL | ||
gpa GPA | ||
gy GY | ||
ha HA | ||
h H | ||
hl HL | ||
hp GP | ||
hs HS | ||
kb KB | ||
kl KL | ||
kn KN | ||
kt KT | ||
kv KV | ||
lm LM | ||
ma MA | ||
mA MA | ||
mb MB | ||
mc MC | ||
mf MF | ||
m M | ||
mm MM | ||
ms MS | ||
mv MV | ||
mw MW | ||
pb PB | ||
pg PG | ||
ps PS | ||
s S | ||
tb TB | ||
tb YB | ||
zb ZB |
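
A minimal sketch of how a two-column listing like the one above could be read into a lookup table. It assumes each row of the underlying file is `written form<TAB>spoken form`; the `.tsv` extension suggests this, but the separator is not visible in this view. `SAMPLE_UNIT_TSV` and `load_unit_map` are hypothetical names used for illustration only and are not part of NeMo.

```python
import csv
import io

# Hypothetical sample rows copied from the listing; the tab separator is an assumption.
SAMPLE_UNIT_TSV = (
    "km\tkilometer\n"
    "kw\tkilowatt\n"
    "kW\tkilowatt\n"
    "°C\tdegree Celsius\n"
    "\"\tinch\n"
)


def load_unit_map(handle):
    """Read `written<TAB>spoken` rows into a dict with case-sensitive keys."""
    units = {}
    # QUOTE_NONE keeps a bare double quote (the inch symbol) as ordinary data.
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        written, spoken = row
        units[written] = spoken  # a later row overwrites an earlier one with the same key
    return units


unit_map = load_unit_map(io.StringIO(SAMPLE_UNIT_TSV))
print(unit_map["kW"])
```

A plain dict keeps only one spoken form per written form, so it suits a listing where each abbreviation has a single reading. Keys are kept case-sensitive because the listing has separate rows for `kw` and `kW`.
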
43 changes: 43 additions & 0 deletions
nemo_text_processing/text_normalization/en/data/measure/unit_alternatives.tsv
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
atm atmosphere | ||
bq becquerel | ||
cd candela | ||
da dalton | ||
eb exabyte | ||
f degree Fahrenheit | ||
gb gigabyte | ||
g gram | ||
gl gigaliter | ||
ha hectare | ||
h hour | ||
hl hectoliter | ||
hp horsepower | ||
hp horsepower | ||
kb kilobit | ||
kb kilobyte | ||
ma megaampere | ||
mA megaampere | ||
ma milliampere | ||
mA milliampere | ||
mb megabyte | ||
mc megacoulomb | ||
mf megafarad | ||
m meter | ||
m minute | ||
mm millimeter | ||
mm millimeter | ||
mm millimeter | ||
ms megasecond | ||
ms mega siemens | ||
ms millisecond | ||
mv millivolt | ||
mV millivolt | ||
mw megawatt | ||
mW megawatt | ||
pb petabyte | ||
pg petagram | ||
ps petasecond | ||
s second | ||
tb terabyte | ||
tb terabyte | ||
yb yottabyte | ||
zb zettabyte |
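
The listing above gives more than one reading for some abbreviations (for example, `ms` appears as megasecond, mega siemens, and millisecond). The sketch below shows one way such rows could be grouped, again assuming tab-separated rows. `SAMPLE_ALTERNATIVES_TSV` and `load_alternatives` are hypothetical names for illustration, not the loader NeMo itself uses.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows copied from the listing; the tab separator is an assumption.
SAMPLE_ALTERNATIVES_TSV = (
    "ms\tmegasecond\n"
    "ms\tmega siemens\n"
    "ms\tmillisecond\n"
    "hp\thorsepower\n"
    "hp\thorsepower\n"
)


def load_alternatives(handle):
    """Group `written<TAB>spoken` rows into {written: [spoken, ...]}, dropping exact repeats."""
    alternatives = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        written, spoken = row
        if spoken not in alternatives[written]:
            alternatives[written].append(spoken)  # keep first-seen order
    return dict(alternatives)


alternatives = load_alternatives(io.StringIO(SAMPLE_ALTERNATIVES_TSV))
for reading in alternatives["ms"]:
    print(reading)
```

Storing a list per key preserves every candidate reading, leaving the choice among them to whatever consumes the table.
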
4 files renamed without changes.
13 changes: 13 additions & 0 deletions
nemo_text_processing/text_normalization/en/data/ordinal/__init__.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. |
2 files renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,3 +6,7 @@ CLASS | |
PART | ||
Part | ||
part | ||
article | ||
Article | ||
Section | ||
section |