- Title:
- Miroslav Hlaváč. Automatic lipreading using LipsID features
DISSERTATION TABLE OF CONTENTS
Candidate of Sciences Miroslav Hlaváč
Contents
Abstract
Synopsis
I Introduction
1. Introduction
2. Lipreading
2.1 Human lipreading
2.2 Automated lipreading
2.2.1 Visual Features Extraction
2.2.2 Extracted Features Processing
3. Dissertation Goals
3.1 Visual Speech Features Representation
3.2 New feature extraction method development
3.3 DNN Based Visual Speech Recognition
II Methodology
4. Statistical Models
4.1 Statistical Models of Shape
4.1.1 Landmarks
4.1.2 Aligning the Training Set
4.1.3 Modelling the Shape Variance
4.1.4 Model Generation and Constraints
4.1.5 Fitting the Model to New Points
4.2 Statistical Model of Appearance
4.2.1 Statistical Model of Texture
4.2.2 Combined Appearance Model
4.2.3 Image Warping
4.3 Active Shape Model
4.3.1 Modelling Local Structure
4.4 Active Appearance Model
4.4.1 AAM Search
4.4.2 Learning the Relation between δc and δI
4.4.3 Iterative Model Refinement
5. Neural Networks
5.1 Artificial Neuron
5.1.1 Activation Functions
5.2 Neural Network Topology
5.2.1 Fully Connected Layer
5.2.2 Convolutional Layer
5.2.3 Response Normalisation Layers
5.2.4 Pooling Layers
5.2.5 Recurrent Layers
5.2.6 Softmax Layer
5.3 Training the Network
5.3.1 Cost Functions
5.3.2 Optimisation Algorithms
5.4 Deep Learning Frameworks
5.4.1 Caffe
5.4.2 Theano
5.4.3 TensorFlow
5.4.4 Torch7
5.4.5 CNTK
6. State-of-the-art methods for feature extraction and visual speech recognition
6.1 State-of-the-Art Methods for feature extraction
6.1.1 Chehra
6.1.2 Ensemble of Regression Trees
6.1.3 Improving Visual Features for Lip-reading
6.1.4 Per-speaker z-score Normalisation
6.1.5 VGG
6.1.6 ResNet
6.2 Visual Speech Recognition
6.2.1 View Independent Computer Lip-reading
6.2.2 Adaptive Multimodal Fusion by Uncertainty Compensation
6.2.3 LSTM Lipreading
6.2.4 Lip Reading in the Wild
6.2.5 LipNet
6.2.6 WLAS network
6.2.7 Transformer network
7. Datasets
7.1 Landmark and Object Detection Datasets
7.1.1 Helen
7.1.2 LFPW
7.1.3 ILSVRC2012
7.2 Audio-visual Speech Recognition Datasets
7.2.1 LiLIR
7.2.2 OuluVS
7.2.3 AV-TIMIT
7.2.4 TCD-TIMIT
7.2.5 AVICAR
7.2.6 GRID
7.2.7 LRW
7.2.8 LRS
III Contribution to the state-of-the-art
8. Visual speech features analysis
8.1 Geometric features
8.2 Appearance features
8.3 Deep features
8.4 Feature use analysis
8.4.1 Height and width
8.4.2 Mutual information
8.4.3 Image quality
8.4.4 Appearance of tongue and teeth
8.4.5 DCT features
8.5 UWB-HSCAVC dataset extension
9. LipsID
9.1 Development of new deep visual features
9.2 LipsID using 3D convolutions
9.3 LipsID using ArcFace
9.4 Final form of LipsID features
10. Lipreading Experiments
10.1 The problem of feature normalisation
10.2 LipNet with LipsID
10.2.1 Results
10.3 AVSR with LipsID
10.3.1 Testing with TCD-TIMIT dataset
IV Conclusion
11. Conclusion
11.1 Thesis summary
11.2 Dissertation goals
11.2.1 Visual Speech Features Representation
11.2.2 New Feature Extraction Method Development
11.2.3 DNN Based Visual Speech Recognition
11.3 Future work
List of figures
List of tables
References
Author's publications on the dissertation topic