Mininglamp Technology's Blockformer Speech Recognition Model Achieves SOTA Results on the AISHELL-1 Test Set

2022-09-13

Mininglamp Technology will soon open-source its Blockformer speech recognition model, improving conversational intelligence in the sales process and supporting the digital and intelligent transformation of industries.

Deep learning has been applied successfully to speech recognition, and many kinds of neural networks have been widely studied and explored, for example deep neural networks (DNN), convolutional neural networks (CNN), recurrent neural networks (RNN), and end-to-end neural network models.

There are currently three main end-to-end model frameworks: the Neural Transducer (NT), the Attention-based Encoder Decoder (AED), and Connectionist Temporal Classification (CTC).

NT is an enhanced version of CTC. It adds a prediction network module, comparable to the language model in a traditional speech recognition framework, and its decoder takes the previously predicted history as context input. NT training is unstable and needs more memory, which can limit training speed.

AED consists of an encoder, a decoder, and an attention module. The encoder encodes the acoustic features, the decoder generates the sentence, and the attention mechanism aligns the encoder's input features with the decoder states. Many ASR systems in the industry are built on AED. However, an AED model emits its output one unit at a time, and each unit depends both on the previously generated results and on the subsequent context, which introduces recognition latency.

In addition, in real speech recognition tasks the alignment produced by AED's attention mechanism is sometimes corrupted by noise.

CTC decodes faster than AED, but because it assumes conditional independence between output units and lacks the constraint of a language model, its recognition accuracy leaves room for improvement.
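To make the conditional independence point concrete, here is a minimal sketch of greedy CTC decoding. It assumes the blank symbol has index 0 and that the acoustic model yields one score vector per frame; both are illustrative choices, not details from the article.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_scores):
    """Decode a list of per-frame score lists into a label sequence."""
    # Each frame is decided on its own: no frame sees another frame's output.
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    # Collapse consecutive repeats, then drop blanks.
    collapsed = [label for label, _ in groupby(best_path)]
    return [label for label in collapsed if label != BLANK]
```

Because every frame is labeled independently, decoding is a single pass, but nothing stops the model from emitting a sequence that a language model would consider implausible.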

Some existing research fuses the AED and CTC frameworks: it uses multi-task learning with a shared encoder and trains with the CTC and AED objectives at the same time. In terms of model architecture, the Transformer has shown great advantages in machine translation, speech recognition, and computer vision.
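The joint CTC and AED training mentioned above is often written as an interpolation of the two losses. The form below is the commonly used hybrid formulation rather than a detail taken from this article, and the weight \lambda is an assumed hyperparameter.

```latex
\mathcal{L}_{\text{joint}} = \lambda \, \mathcal{L}_{\text{CTC}} + (1 - \lambda) \, \mathcal{L}_{\text{AED}}, \qquad 0 \le \lambda \le 1
```

Both loss terms are computed on top of the same shared encoder, so gradients from each objective update the encoder parameters.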

Zhu Huifeng, Senior Director and head of speech technology at Mininglamp Technology Group, explained that the Mininglamp team focused on how to use the Transformer model to improve recognition accuracy within the joint CTC and AED training framework.

Using visualization, the Mininglamp team analyzed the attention information across different blocks and heads. The diversity of this information is very helpful: the outputs of the individual blocks in the encoder and decoder do not fully contain one another and may be complementary. (https://doi.org/10.48550/arXiv.2207.11697)

Based on this insight, the team proposed a model architecture, the Block-augmented Transformer (Blockformer). It explores how to fuse the base information of each block in a complementary, parameterized way, and implements two block ensemble methods: Weighted Sum of the Blocks Output (Base-WSBO) and Squeeze-and-Excitation module to WSBO (SE-WSBO).
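The two ensemble methods can be illustrated with a rough sketch. The code below is not the team's implementation: the softmax normalization in `base_wsbo`, the pooling and gate layout in `se_wsbo`, and all function and parameter names are assumptions made for illustration, and the paper linked above should be consulted for the actual design.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(block_outputs, weights):
    """Combine N block outputs, each a [T][D] nested list, into one [T][D] output."""
    frames, dims = len(block_outputs[0]), len(block_outputs[0][0])
    return [
        [sum(w * blk[t][d] for w, blk in zip(weights, block_outputs)) for d in range(dims)]
        for t in range(frames)
    ]


def base_wsbo(block_outputs, block_logits):
    """Base-WSBO sketch: one learnable scalar per block (softmax is an assumption)."""
    return weighted_sum(block_outputs, softmax(block_logits))


def se_wsbo(block_outputs, w_squeeze, w_excite):
    """SE-WSBO sketch: input-dependent block weights from a squeeze-and-excitation gate.

    w_squeeze has shape [H][N] and w_excite has shape [N][H] (hypothetical layout).
    """
    # Squeeze: average each block output over time and features to one scalar.
    pooled = [sum(map(sum, blk)) / (len(blk) * len(blk[0])) for blk in block_outputs]
    # Excitation: linear -> ReLU -> linear -> sigmoid, giving one gate per block.
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w_squeeze]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_excite
    ]
    return weighted_sum(block_outputs, gates)


blocks = [[[1.0, 2.0]], [[3.0, 4.0]]]  # two blocks, one frame, two features
print(base_wsbo(blocks, [0.0, 0.0]))   # equal logits weight both blocks equally
```

The difference between the two sketches is where the weights come from: in `base_wsbo` they are fixed parameters shared by all inputs, while in `se_wsbo` they are recomputed from each input's own block outputs.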

Experiments show that on the Mandarin Chinese test set (AISHELL-1), the Blockformer model achieves a CER of 4.35% without a language model and 4.10% with a language model.
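CER (character error rate) is the character-level edit distance between the recognized text and the reference, divided by the reference length. The sketch below shows the usual definition; the function names are illustrative and it is not the evaluation script used for the reported numbers.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference character
                curr[j - 1] + 1,         # insertion of a hypothesis character
                prev[j - 1] + (r != h),  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate of hypothesis `hyp` against reference `ref`."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# One substituted character out of six gives a CER of 1/6.
print(cer("今天天气很好", "今天天汽很好"))
```

Under this definition, a CER of 4.10% corresponds to roughly four character errors per hundred reference characters.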

AISHELL-1 is a Mandarin Chinese speech corpus open-sourced by AISHELL in 2017. It contains 178 hours of recordings made by 400 speakers from different regions of China. The corpus covers 11 domains, including smart home, autonomous driving, and industrial production. It is used heavily in speech technology development and experiments, and is one of the authoritative benchmarks for Chinese speech recognition today.

According to the AI wiki site Papers With Code, Blockformer achieves SOTA recognition results on AISHELL-1, lowering the character error rate to 4.10% (with a language model). (https://paperswithcode.com/sota/speech-recognition-on-aishell-1)

Hao Jie, CTO of Mininglamp Technology Group, said that Mininglamp's conversational intelligence products target sales scenarios built on online WeCom conversations and offline in-store conversations. The speech recognition team focuses on scenario optimization and custom training for industries such as beauty, automotive, and education, while continuing to explore new frameworks and models for general-purpose speech recognition. The Blockformer model's SOTA result gives custom speech recognition optimization a high starting point, and Mininglamp will soon open-source Blockformer.