[LEADERG APP] Speech2Text


 

[Introduction]

 

LEADERG-Speech2Text inference.png

 

The Speech2Text APP trains a model on audio files, uses the resulting TensorFlow model to analyze a selected audio file, and outputs the file's category (the recognized word), which is how it achieves speech-to-text.

 

[Operation steps and instructions]

 

1. Prepare dataset

The default dataset contains audio files for three words: bed, cat, and happy. The files are placed in the english_word/train folder. To use this dataset, select english_word in Select Dataset.

 

To use your own dataset, copy the english_word folder and place the copy at the same level as english_word. Delete all files and folders inside the copy's train folder, then create one folder per word, named after that word, and put that word's wav audio files in it. Each audio file should be about 1 second long.

 

LEADERG-Speech2Text dataset.png
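
The dataset layout described above (one folder per word inside train, each holding wav clips of about 1 second) can be checked before training. The following is a minimal sketch, not part of the APP: the function names, the 0.25-second tolerance, and the example folder name are assumptions made for illustration.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of a PCM wav file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_dataset(train_dir, target=1.0, tolerance=0.25):
    """Map each word folder to the wav files whose length is far from `target`.

    The tolerance is an illustrative assumption; the documentation only
    asks for clips of "about 1 second".
    """
    problems = {}
    for word_dir in sorted(p for p in Path(train_dir).iterdir() if p.is_dir()):
        problems[word_dir.name] = [
            wav.name
            for wav in sorted(word_dir.glob("*.wav"))
            if abs(wav_duration_seconds(wav) - target) > tolerance
        ]
    return problems
```

Calling check_dataset("my_words/train") on a hypothetical copy of the dataset folder would return a dictionary keyed by word, where an empty list means every clip in that folder is close to 1 second.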

 

2. Train

 

Press Train to start training.

To use a different Batch Size or number of training iterations, enter the values yourself.

The trained model is placed in the model folder.

The Load Weight checkbox controls whether previously trained weights are loaded before training.

If this is the first time the model is trained, or the set of training words has changed (for example, going from 3 words to 4), uncheck it.

If you have already trained a model and want to continue training it, and no new categories have been added to the training folder, you can check Load Weight to shorten the training time.

 

LEADERG-Speech2Text train.png
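
The Load Weight rule above can be summarized as a small decision function. This is only a sketch of the rule as described, not the APP's own code; the function name and its parameters are hypothetical.

```python
def should_load_weights(trained_words, current_words, checkpoint_exists):
    """Decide whether checking Load Weight makes sense.

    trained_words: words the saved model was trained on
    current_words: folder names currently in the train folder
    checkpoint_exists: True if a previously trained model is available
    """
    if not checkpoint_exists:
        return False  # first training run: there is nothing to load
    # If the word list changed, the saved categories no longer match
    # the training folder, so training should start without old weights.
    return set(trained_words) == set(current_words)
```

For example, a model trained on bed, cat, and happy can keep its weights while the train folder still holds exactly those three word folders, but not after a fourth word folder is added.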

 

3. Inference

There are three inference modes: inferring a single audio file, inferring a folder, and inferring from the microphone.

 

To choose the model used for inference, select the file or enter its name in the Inference Model Path area.

 

After you select any of the model files, it is normal for Weight Path to show only cp-XXX.ckpt. If you type the file name yourself, use this same format; do not enter cp-XXX.ckpt.index or cp-XXX.ckpt.data-00000-of-00001.
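
A TensorFlow checkpoint is stored as several files that share one prefix, which is why only the cp-XXX.ckpt part is entered. The sketch below shows one way to reduce a selected file name to that prefix. The helper name and the example file name cp-0010.ckpt are hypothetical, and this is not necessarily how the APP does it.

```python
import re

# Suffixes of the files that TensorFlow writes next to a checkpoint prefix.
_CHECKPOINT_SUFFIX = re.compile(r"\.(index|data-\d{5}-of-\d{5})$")


def checkpoint_prefix(path):
    """Strip a trailing .index or .data-XXXXX-of-XXXXX from a checkpoint path."""
    return _CHECKPOINT_SUFFIX.sub("", path)
```

With this helper, both "model/cp-0010.ckpt.index" and "model/cp-0010.ckpt.data-00000-of-00001" reduce to "model/cp-0010.ckpt", and a path that is already a prefix is returned unchanged.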

 

(1) Inferring a single audio file

 

Press the icon to select the wav file you want to infer.

 

LEADERG-Speech2Text inference wav file.png

 

(2) Inferring a folder

 

Press the icon to select the folder that contains the wav audio files to be inferred.

 

LEADERG-Speech2Text inference folder.png
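
Folder inference amounts to running the single-file inference on every wav file in the chosen folder. As a rough sketch of the file-collection step only (the function name is hypothetical, and whether the APP also descends into subfolders is not stated, so this version does not):

```python
from pathlib import Path


def list_wav_files(folder):
    """Return the wav files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".wav"
    )
```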

 

(3) Inferring from the microphone

 

Press the icon to turn on the microphone and record for 10 seconds. The APP then infers the content of the recording one second at a time.

 

Set the recording length, in seconds, in Record Length.

 

LEADERG-Speech2Text inference microphone.png
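
Microphone inference works on the recording one second at a time. A minimal sketch of that splitting step is shown below, assuming the recording is available as a flat sequence of samples at a known sample rate. The function name is hypothetical, and dropping a trailing partial second is an assumption rather than documented behavior.

```python
def split_into_seconds(samples, sample_rate):
    """Split a sequence of samples into consecutive one-second chunks.

    A trailing chunk shorter than one second is dropped (an assumption).
    """
    full_seconds = len(samples) // sample_rate
    return [
        samples[i * sample_rate:(i + 1) * sample_rate]
        for i in range(full_seconds)
    ]
```

A 10-second recording therefore yields 10 chunks, each of which would be passed to the model in the same way as a single 1-second wav file.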

[Contact Us and How to Buy]


Feel free to contact us through the following link:

https://www.leaderg.com/article/index?sn=11059

