Mininglamp Technology's Blockformer Speech Recognition Model Achieves SOTA Results on the AISHELL-1 Test Set

2022-09-13

Mininglamp Technology will soon open-source its Blockformer speech recognition model, which improves conversational intelligence in the sales process and supports the digital and intelligent transformation of industries.

Deep learning has been applied successfully to speech recognition, and many kinds of neural networks have been widely studied and explored, including the Deep Neural Network (DNN), the Convolutional Neural Network (CNN), the Recurrent Neural Network (RNN), and end-to-end neural network models.

There are currently three main end-to-end model frameworks: the Neural Transducer (NT), the Attention-based Encoder Decoder (AED), and Connectionist Temporal Classification (CTC).

NT is an enhanced version of CTC. It introduces a prediction network module, comparable to the language model in a traditional speech recognition framework, and the decoder takes the previously predicted history as contextual input. NT training is unstable and needs more memory, which can limit training speed.

AEDÓɱàÂëÆ÷£¬£¬ £¬£¬£¬£¬£¬½âÂëÆ÷ºÍ×¢ÖØÁ¦»úÖÆÄ£¿£¿£¿£¿£¿é×é³É£¬£¬ £¬£¬£¬£¬£¬Ç°Õß¶ÔÉùÑ§ÌØÕ÷¾ÙÐбàÂ룬£¬ £¬£¬£¬£¬£¬½âÂëÆ÷ÌìÉú¾ä×Ó£¬£¬ £¬£¬£¬£¬£¬×¢ÖØÁ¦»úÖÆÓÃÀ´¶ÔÆë±àÂëÆ÷ÊäÈëÌØÕ÷Ï¢ÕùÂë״̬¡£¡£¡£¡£ ¡£¡£¡£ÒµÄÚ²»ÉÙASRϵͳ¼Ü¹¹»ùÓÚAED¡£¡£¡£¡£ ¡£¡£¡£È»¶ø£¬£¬ £¬£¬£¬£¬£¬AEDÄ£×ÓÖð¸öµ¥Î»Êä³ö£¬£¬ £¬£¬£¬£¬£¬ÆäÖÐÿ¸öµ¥Î»¼ÈÈ¡¾öÓÚÏÈËÞÊÀ³ÉµÄЧ¹û£¬£¬ £¬£¬£¬£¬£¬ÓÖÒÀÀµºóÐøµÄÉÏÏÂÎÄ£¬£¬ £¬£¬£¬£¬£¬Õâ»áµ¼ÖÂʶ±ðÑÓ³Ù¡£¡£¡£¡£ ¡£¡£¡£

In addition, in real-world speech recognition tasks, the alignment produced by AED's attention mechanism is sometimes corrupted by noise.

CTC decodes faster than AED, but because its output units are conditionally independent and it lacks the constraint of a language model, its recognition accuracy leaves room for improvement.
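To make the conditional-independence point concrete, here is a minimal sketch of greedy (best-path) CTC decoding. It assumes per-frame posteriors are already available as plain lists and that index 0 is the blank label; the function name and the toy numbers are illustrative and do not come from the Blockformer work. Each frame is decided on its own, which is why decoding is fast and also why no language-model constraint is applied.

```python
def ctc_greedy_decode(frame_posteriors, blank=0):
    """Greedy CTC decoding: pick the best label per frame independently,
    then collapse consecutive repeats and drop blanks.

    frame_posteriors: list of frames, each a list of label probabilities.
    blank: index of the CTC blank label (assumed to be 0 here).
    """
    # Per-frame argmax; no frame looks at any other frame's decision.
    best_path = [max(range(len(p)), key=p.__getitem__) for p in frame_posteriors]

    decoded = []
    previous = None
    for label in best_path:
        # Keep a label only when it differs from the previous frame and is not blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Toy example with a 3-label vocabulary (0 = blank).
frames = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.6, 0.3],
    [0.1, 0.2, 0.7],
    [0.1, 0.1, 0.8],
    [0.8, 0.1, 0.1],
]
print(ctc_greedy_decode(frames))
```

The blank between the second and fourth frames is what lets the same label appear twice in a row in the output.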

Some existing research combines the AED and CTC frameworks through multi-task learning with a shared encoder, training with the CTC and AED objectives at the same time. In terms of model structure, the Transformer has already shown great advantages in machine translation, speech recognition, and computer vision.
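Joint training of this kind is commonly written as an interpolation of the two losses computed on top of the shared encoder. The sketch below assumes that form; the function name and the default weight of 0.3 are illustrative assumptions, not values reported in the article.

```python
def joint_ctc_aed_loss(ctc_loss, aed_loss, ctc_weight=0.3):
    """Interpolate the CTC loss and the AED (attention) loss.

    ctc_weight is a hypothetical hyperparameter in [0, 1]; the article
    does not say which value the Blockformer training used.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * aed_loss


# Example: equal weighting of two scalar loss values.
print(joint_ctc_aed_loss(2.0, 1.0, ctc_weight=0.5))
```

Setting the weight to 0 or 1 recovers pure AED or pure CTC training respectively, which makes the trade-off easy to explore.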

Zhu Huifeng, Senior Director and head of speech technology at Mininglamp Technology Group, explained that the Mininglamp team focused on how to use the Transformer model to improve recognition accuracy within the joint CTC and AED training framework.

The Mininglamp team visualized and analyzed the attention information across different blocks and heads. The diversity of this information turned out to be very helpful: the outputs of the individual blocks in the encoder and decoder do not fully contain one another and may instead be complementary. (https://doi.org/10.48550/arXiv.2207.11697)

Based on this insight, the Mininglamp team proposed a model structure, the Block-augmented Transformer (BlockFormer), and studied how to fuse the basic information of each block in a complementary, parameterized way. The team implemented two block-ensemble methods: Weighted Sum of the Blocks Output (Base-WSBO) and Squeeze-and-Excitation module to WSBO (SE-WSBO).
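As a rough illustration of the weighted-sum idea, the sketch below fuses the outputs of several blocks using one scalar weight per block. It assumes the weights are normalized with a softmax, which the article does not state, and it uses plain Python lists instead of tensors; all names are hypothetical. It does not cover the SE-WSBO variant, whose details are not given here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum_of_blocks(block_outputs, raw_weights):
    """Fuse N block outputs into one sequence.

    block_outputs: list of N blocks; each block is a list of T frames,
                   each frame a list of D floats (same T and D for all blocks).
    raw_weights:   list of N scalars, standing in for learnable parameters.
                   Softmax normalization is an assumption of this sketch.
    """
    weights = softmax(raw_weights)
    num_frames = len(block_outputs[0])
    dim = len(block_outputs[0][0])
    fused = [[0.0] * dim for _ in range(num_frames)]
    for weight, block in zip(weights, block_outputs):
        for t in range(num_frames):
            for d in range(dim):
                fused[t][d] += weight * block[t][d]
    return fused


# Two blocks, one frame each, two features; equal raw weights give a plain average.
blocks = [[[1.0, 2.0]], [[3.0, 4.0]]]
fused = weighted_sum_of_blocks(blocks, raw_weights=[0.0, 0.0])
print(fused)
```

In a real model the raw weights would be trained together with the rest of the network, so blocks that carry more useful information receive larger weights.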

Experiments show that on the Mandarin Chinese test set (AISHELL-1), the Blockformer model achieves a character error rate (CER) of 4.35% without a language model and 4.10% with a language model.
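For readers unfamiliar with the metric, CER is usually computed as the character-level edit distance between the recognized text and the reference, divided by the reference length. The following is a minimal sketch under that standard definition; the function names and the example sentences are illustrative and are not taken from the AISHELL-1 evaluation scripts.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (substitutions,
    insertions, and deletions each cost 1)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance over characters / number of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One wrong character out of six.
print(character_error_rate("今天天气很好", "今天天汽很好"))
```

A CER of 4.10% therefore corresponds to roughly four character errors per hundred reference characters.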

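AISHELL-1 is a Mandarin Chinese speech database released as open source by AISHELL (Beijing Shell Shell Technology) in 2017. It contains 178 hours of recordings made by 400 speakers from different regions of China. The database covers 11 domains, including smart home, autonomous driving, and industrial production. It is used frequently in speech technology development and experiments, and is one of the authoritative databases for evaluating Chinese speech recognition today.

According to the AI wiki site Papers With Code, Blockformer achieves SOTA recognition results on AISHELL-1, lowering the character error rate to 4.10% (with a language model).

(https://paperswithcode.com/sota/speech-recognition-on-aishell-1)

Hao Jie, CTO of Mininglamp Technology Group, said that Mininglamp's conversational intelligence products target sales scenarios built on online WeCom conversations and offline in-store conversations. The speech recognition team focuses on scenario optimization and customized training for industries such as beauty, automotive, and education, while continuing to explore new frameworks and new models for general-purpose speech recognition. The SOTA result of the Blockformer model gives customized optimization of speech recognition a high starting point, and Mininglamp will soon open-source Blockformer.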
