- 
          AudioVisual Recognition
  (Embedded)  
  (Server Based)
 
          (Combination of Speaker, Speech, Face Recognition, and Object Detection and Recognition with a single interface) 
                  
      - 
          Large-Vocabulary Speech Recognition
  (Embedded)  
  (Server Based)
 
          Initially available for English, Spanish, Mandarin, Arabic, and German; now available for 100+ languages 
          Also includes multilingual support and code-switching 
          (Customizable domain full transcription ~ 300,000+ word vocabulary) 
                  
      - 
          Speaker Recognition
  (Embedded)
  (Server Based)
 
          (Language- and Text-Independent, aka: Speaker Biometrics, Voice Biometrics, or SIV) 
          Recipient: Frost & Sullivan Award 2011 
                
        RecoMadeEasy® 
      
      
        Server-Based Speaker Recognition
      
      
      
      Platform:  
      
      Language- and Text-Independence: The speaker recognition system is completely text- and language-independent. This means that a user may enroll her/his voice into the system in one language and be identified or verified in a completely different language. The engine can therefore handle authentication and identification across any number of languages.
      
           
      The RecoMadeEasy® Speaker Recognition (SPKR) (SIV) System
      is an award-winning engine developed entirely by Recognition
      Technologies, Inc. It
      currently runs on Linux, Mac, and Windows operating systems.
      The engine is compatible with all microphone devices and most audio file formats.  
  The SIV
      system is fully integrated with our IVR system which is
      compatible with 
      Dialogic®
      telephony T1 and E1 cards as well as their analog cards.
      It may also be run in a stand-alone environment independent of our
      IVR system in a telephony or non-telephony setting.
       
      
          This is a state-of-the-art language- and
          text-independent speaker recognition system (voice biometrics system) which has been developed
          to work in different environments.  Large-Scale and Small-Scale
          versions of this speaker identification and speaker verification
          (SIV) engine have been developed over many years of research to work
          in telephony as well as stand-alone environments.  This speaker
          biometric engine may be customized to fit your exact needs including
          special modifications to fit the operating environment in which
          your related applications run.  Our staff has been actively
          involved in defining speaker recognition (speaker biometric or voice biometrics)
          standards in the VoiceXML and ANSI communities by providing
          detailed consultation to the VoiceXML and M1 committees involved
          in defining the speaker verification and identification standards.
       
      
      Capabilities
       
      The RecoMadeEasy® SIV system operates in 6 different
      modalities: 
       
      - Speaker Identification (Open-Set and Closed-Set)
 
      The speaker enrolls his voice with the system.  The system trains for
      this and other speakers' voices.  Once the speaker returns, the system
      only has to listen to the speaker and will be able to identify the
      speaker's voice among the trained voices it has in the database.  The
      identification process returns an ID for the speaker.  There are two
      different identification approaches.  The simpler one is called
      Closed-Set Identification in which case the ID of the closest voice in
      the database is returned.  In this case, if the speaker is not in the
      database, the returned ID may be wrong, since the closest
      voice in the database is always picked.  The more sophisticated (but harder)
      approach is called Open-Set Identification, where the speaker is
      either tagged with an ID from the database or, if the speaker has not been
      enrolled in the database, rejected as not-enrolled.
      Our SIV engine supports both Open-Set and Closed-Set approaches. 
        
      - Speaker Verification
 
      In this modality, again, the speaker has to enroll his voice.  Once the
      enrollment process is done (recording of about 30 seconds of speech and
      obtaining a positive ID of the speaker), the speaker is added to the
      database.  When the speaker returns, he makes a claim of his identity.
      He will also speak for a few seconds and the speaker's voice is matched
      against the database.  His identity is either authenticated or he is
      rejected as an impostor.  It is important to note that there are two
      possible sources of error: (1) False Acceptance and (2) False Rejection.
      A false acceptance error happens when an impostor is mistakenly
      authenticated.  This is the error rate to minimize in
      more security-conscious applications.  There is a trade-off between
      false acceptance and false rejection.  Reducing the false
      acceptance rate tightens security, but it
      naturally increases the number of false rejections.  False rejections
      can become annoying to legitimate users if they are not limited.
        
      - Speaker Classification and Event Detection
 
      This modality of the engine may be used to classify speakers into
      groups such as gender groups (male/female/child).  Language detection
      may also be viewed as classification.  Age group and many other
      categories may also be used to perform speaker classification.  This may
      also be used to classify or detect events such as beeps, speech, horn,
      auto noise, background noise, etc.
        
      - Speaker Detection
 
      This would be the case where a speaker is already enrolled in the
      database and we would be trying to find the speaker among recordings or
      in a live conversation.
        
      - Speaker Tracking
 
      In this case, a speaker's voice is tracked through the conversation, and
      the tracking confirms that the speaker remains on the line.
        
      - Speaker Segmentation
 
      This would be used to segment the speech between two or more speakers in
      a conversation.
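
      The identification and verification decisions described above can be
      illustrated with a small sketch.  It is not the RecoMadeEasy® API: the
      function names, the score dictionaries, and the threshold values are all
      hypothetical, and it assumes that some engine has already produced one
      similarity score per enrolled speaker (higher meaning a closer match).

```python
from typing import Dict, List, Optional, Tuple


def closed_set_identify(scores: Dict[str, float]) -> str:
    """Closed-set: always return the ID of the closest enrolled voice."""
    return max(scores, key=lambda speaker_id: scores[speaker_id])


def open_set_identify(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Open-set: return the closest ID, or None (not enrolled) if the
    best score does not reach the (hypothetical) acceptance threshold."""
    best = closed_set_identify(scores)
    return best if scores[best] >= threshold else None


def verify(scores: Dict[str, float], claimed_id: str, threshold: float) -> bool:
    """Verification: accept the identity claim only if the score for the
    claimed speaker reaches the threshold; unknown IDs are rejected."""
    return scores.get(claimed_id, float("-inf")) >= threshold


def error_rates(genuine: List[float], impostor: List[float],
                threshold: float) -> Tuple[float, float]:
    """Return (false acceptance rate, false rejection rate) at a threshold.

    genuine  = scores from true speakers making correct claims
    impostor = scores from impostors making false claims
    """
    false_accepts = sum(score >= threshold for score in impostor)
    false_rejects = sum(score < threshold for score in genuine)
    return false_accepts / len(impostor), false_rejects / len(genuine)


if __name__ == "__main__":
    # Made-up scores for a caller who is NOT enrolled in the database.
    unknown_caller = {"alice": 0.33, "bob": 0.29, "carol": 0.35}
    print(closed_set_identify(unknown_caller))     # closest voice, mis-tagged
    print(open_set_identify(unknown_caller, 0.6))  # rejected as not enrolled

    # Made-up trial scores showing the false acceptance / rejection trade-off.
    genuine = [0.9, 0.8, 0.7, 0.55]
    impostor = [0.2, 0.35, 0.5, 0.65]
    for threshold in (0.4, 0.6, 0.75):
        print(threshold, error_rates(genuine, impostor, threshold))
```

      With these made-up scores, raising the threshold lowers the false
      acceptance rate and raises the false rejection rate, which is the
      trade-off a security-conscious deployment has to tune.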
        
       
      
      
      The Engine May be Used in the Following Ways 
       
        - Standalone engine which may be run through the use of
            command lines and system calls.
 
        - Standalone engine which may be used through a very simple
            C++ SDK and API. This would be most useful for integrating
            the engine into current products and IVR systems.
 
        - As a module of our RecoMadeEasy® (Reco Made Easy) IVR system.
 
        - As a web service using our servers.
 
        - As a web service using your own servers.
 
       
      
      
      Supported Audio Interface 
       
      The following interfaces are natively supported.  However, the speaker
      recognition engine may be used with any audio interface as long as
      the audio is passed to the engine through third-party software such
      as your own IVR system or recording program.  The engine may be used
      in many different scenarios such as a web service, C++ API, and 
      command-line interface.
       
      
        - All Microphone devices
 
        - All Major Audio File Formats
 
        - All Dialogic JCT Telephony cards (T1 and Analog)
 
       
      
      
      Supported Operating Systems 
       
      
        The speaker recognition engine (voice biometrics engine) is available for the following
        operating systems.  The C++ SDK, command-line interface, and web
        services may be used in any of the following systems:
       
      
      Linux (both 32-bit and 64-bit versions are supported) 
       
        - CentOS 8 and 7.9 Linux (Latest)
 
        - Previous CentOS Linux versions: 7.3, 7.2, 7.1, 7.0, 6.6, 6.4, 6.3,
        6.2, 5.7, 5.6, 5.4
  
        - Fedora 40 Linux (Latest)
 
        - Previous Fedora Linux versions: 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16,
        15, 14, 13, 12, 11, 10, 9, 8, 7, 6, Core 5, Core 4, Core 3, Core 2,
        Core
  
        - Ubuntu 24.04 Linux (Latest)
  
        - Previous Ubuntu Linux versions: 22.04, 20.04, 18.04, 16.04
         
  
        - N.B.: May be made available for other Unix-Like systems upon request
 
       
      Microsoft Windows 
      
        - 64-bit - Windows 7 (Latest)
 
        - 32-bit - Windows XP
 
       
      Apple Macintosh 
      
        - Mac OS X - 10.8.4 (Latest)
 
        - Previous OS X versions: 10.6.8, 10.5
  
       
      
      An evaluation account for the hosted version of
      the RecoMadeEasy® Speaker
      Recognition software may be made available to interested
      organizations.
 
      
        
      - 
          Face Recognition
  (Embedded)  
  (Server Based)
 
          (Face detection and recognition) 
                  
      - 
          Object Recognition
  (Embedded)  
  (Server Based)
 
          (Object detection and recognition) 
                 
     
   