RT_{60} can be used to estimate the critical distance in applications where early and late reflections are significant (cf. Refs. [2, 3]).

When designing and evaluating prototypes of voice enhancement products for real-life acoustic scenes, one prerequisite is to outline acoustic specifications such as the distance from the signal source (for example, the speaker’s mouth) to the microphones of the voice terminal. This requirement is closely tied to the acoustic conditions of the scene or room, viewed in terms of how the acoustic energy is split: direct-path energy versus reverberant energy.

## Direct Path

When most of the acoustic energy reaching the microphones arrives via the direct path, acoustic reflections, both early and late (the latter constitute the reverberation effect), are a secondary consideration when designing noise reduction and echo cancellation components. Secondary does not mean negligible. It only means that, on a logarithmic scale, these effects have a somewhat smaller direct impact on voice quality than the effects associated with the direct path.

One specific example illustrates this case. When designing an echo canceller for a Bluetooth-based device, the main concern is echo path coverage for the direct coupling between the Bluetooth earpiece and the microphone. Acoustic reflections in the room do not affect speech quality very much, since their effects can be handled by the AEC non-linear processor (cf. Ref. [1]).

## Early and Late Reflections

In the opposite case, i.e., when most of the acoustic energy reaching the microphones comes from early and late reflections, practical considerations call for strengthening the voice enhancement solutions so that they remain fully functional in a more demanding acoustic environment. In such cases, addressing, mitigating, and compensating for early and late reflections is of primary importance for voice quality.

## Critical Distance as a Function of RT_{60}

The sound energy density E_{d}, defined as the acoustic energy per unit volume (in J/m^{3}), associated with the direct path is given by the following approximate formula (cf. Ref. [4]):

$$E_d = \frac{Q\, W_s}{4 \pi c D^2} \tag{1}$$

where

- Q is the directivity of the acoustic source (for an omnidirectional source, an acceptable approximation for the human mouth and head or for a Brüel & Kjær Head and Torso Simulator (HATS), it is assumed that Q = 1),
- W_{s} is the acoustic power (in watts) produced by the source,
- c (in m/s) is the speed of sound, and
- D (in m) is the distance from the source to the point of reference (i.e., the observation point, typically the location of the primary microphone on the device under consideration).

Similarly, the sound energy density E_{r} associated with the reverberation effects in the room is given by the following approximate formula (cf. Ref. [4]):

$$E_r = \frac{4\, W_s}{c R} \tag{2}$$

where R is the room constant, which characterizes the acoustic absorption of the room and is defined by the following approximate formula:

$$R = \frac{\alpha A}{1 - \alpha} \tag{3}$$

where *α* denotes the average absorption coefficient of all surfaces combined in the room and A is the total absorption surface area.

The critical distance D_{c} is defined as the distance from the source to the observation point at which the two sound energy densities, E_{d} and E_{r}, are equal. Thus, by equating Eq. 1 and Eq. 2 and solving for D, we arrive at:

$$D_c = \sqrt{\frac{Q R}{16 \pi}} \tag{4}$$

It is shown in Ref. [5] that the critical distance can also be expressed in terms of Q, V, and the reverberation time RT_{60} as follows:

$$D_c \approx 0.1 \sqrt{\frac{Q V}{\pi\, RT_{60}}} \tag{5}$$

where V denotes the room volume (in m^{3}).
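The link between the room-constant form and the RT_{60} form of the critical distance can be sketched as follows. This is a common textbook route, not necessarily the derivation given in Ref. [5]. It assumes Sabine's reverberation formula, a small average absorption coefficient (so that 1 − α ≈ 1), a speed of sound near 344 m/s, and the standard expression D_{c} = sqrt(QR / 16π) for the critical distance.

```latex
\begin{align*}
RT_{60} &\approx \frac{24 \ln 10}{c}\,\frac{V}{\alpha A}
         \approx 0.161\,\frac{V}{\alpha A}
         && \text{(Sabine, } c \approx 344\ \text{m/s)} \\
R &= \frac{\alpha A}{1 - \alpha} \approx \alpha A
   \approx 0.161\,\frac{V}{RT_{60}}
         && \text{(small } \alpha\text{)} \\
D_c &= \sqrt{\frac{Q R}{16 \pi}}
     \approx \sqrt{\frac{0.161\, Q V}{16 \pi\, RT_{60}}}
     \approx 0.1 \sqrt{\frac{Q V}{\pi\, RT_{60}}}
\end{align*}
```

Because the last step replaces R with αA, the RT_{60} form tends to give a somewhat smaller critical distance than the room-constant form when α is not small.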

Figure 1 gives an example of E_{d} and E_{r} versus D, together with the resulting value of D_{c}. The sound source power is normalized to 1 W; the room dimensions are 3 m × 4 m × 5 m with *α* = 0.3 (which gives RT_{60} ≈ 290 ms); c is 344 m/s. The resulting D_{c}, calculated from Eq. 5, is 0.9 m.
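The numbers in this example can be reproduced approximately with a short script. This is a minimal sketch under stated assumptions, not the procedure used to generate Figure 1:

- It uses the standard textbook forms E_{d} = Q·W_{s} / (4π·c·D²), E_{r} = 4·W_{s} / (c·R), and R = α·A / (1 − α), with A taken as the total wall, floor, and ceiling area.
- RT_{60} is computed with Eyring's formula. The text does not say which formula produced 290 ms; Eyring's comes close to it for this room.
- All function names are illustrative.

```python
import math


def room_constant(alpha, area):
    """Room constant R = alpha * A / (1 - alpha), in m^2."""
    return alpha * area / (1.0 - alpha)


def direct_energy_density(q, w_s, c, d):
    """Direct-path energy density E_d = Q * W_s / (4 * pi * c * D^2), in J/m^3."""
    return q * w_s / (4.0 * math.pi * c * d ** 2)


def reverberant_energy_density(w_s, c, r):
    """Reverberant energy density E_r = 4 * W_s / (c * R), in J/m^3."""
    return 4.0 * w_s / (c * r)


def critical_distance(q, r):
    """Distance at which E_d equals E_r: D_c = sqrt(Q * R / (16 * pi))."""
    return math.sqrt(q * r / (16.0 * math.pi))


def eyring_rt60(volume, area, alpha, c):
    """Assumption: Eyring's formula, RT60 = 24 ln(10) V / (-c A ln(1 - alpha))."""
    return 24.0 * math.log(10.0) * volume / (-c * area * math.log(1.0 - alpha))


def critical_distance_from_rt60(q, volume, rt60):
    """Sabine-based approximation D_c ~ 0.1 * sqrt(Q * V / (pi * RT60))."""
    return 0.1 * math.sqrt(q * volume / (math.pi * rt60))


# Parameters taken from the example in the text.
Q, W_S, C, ALPHA = 1.0, 1.0, 344.0, 0.3
LX, LY, LZ = 3.0, 4.0, 5.0

VOLUME = LX * LY * LZ                          # 60 m^3
AREA = 2.0 * (LX * LY + LX * LZ + LY * LZ)     # 94 m^2

R = room_constant(ALPHA, AREA)
D_C = critical_distance(Q, R)
RT60 = eyring_rt60(VOLUME, AREA, ALPHA, C)
D_C_RT60 = critical_distance_from_rt60(Q, VOLUME, RT60)

print(f"R = {R:.2f} m^2, RT60 = {RT60 * 1000:.0f} ms")
print(f"D_c from room constant: {D_C:.2f} m")
print(f"D_c from RT60 approximation: {D_C_RT60:.2f} m")
print(f"E_d at D_c: {direct_energy_density(Q, W_S, C, D_C):.3e} J/m^3")
print(f"E_r:        {reverberant_energy_density(W_S, C, R):.3e} J/m^3")
```

Under these assumptions the room-constant form evaluates to roughly 0.9 m, in line with the value quoted above. The RT_{60} form comes out somewhat lower, because it rests on the Sabine approximation R ≈ αA.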

Based on the above example (Figure 1), we can conclude that in a typical acoustic environment, to ensure that early and late reflections do not play a noticeable role in voice quality, the distance between the sound source and the microphone should be at least 10 times smaller than the estimated 90 cm, i.e., no more than about 9 cm. The considerations above do not account for the dependence of RT_{60} on frequency. RT_{60} is typically greater in low-frequency bands, so additional estimates are required when determining “near-talk” distance limits.
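One way to read the factor of 10 is through the direct-to-reverberant energy ratio. With the standard textbook forms E_{d} = Q·W_{s} / (4π·c·D²) and E_{r} = 4·W_{s} / (c·R), the ratio E_{d} / E_{r} equals (D_{c} / D)². Expressed in decibels this is 20·log10(D_{c} / D), so a microphone at one tenth of the critical distance sees the direct path about 20 dB above the reverberant field. The sketch below assumes those forms; the function name and the list of distances are illustrative only.

```python
import math


def direct_to_reverberant_db(distance, critical_distance):
    """Direct-to-reverberant ratio in dB: 10*log10(E_d/E_r) = 20*log10(D_c/D)."""
    return 20.0 * math.log10(critical_distance / distance)


D_C = 0.9  # critical distance from the example, in metres

# Illustrative source-to-microphone distances, in metres.
for d in (0.09, 0.3, 0.9, 2.0):
    drr = direct_to_reverberant_db(d, D_C)
    print(f"D = {d:.2f} m -> direct-to-reverberant ratio = {drr:+.1f} dB")
```

Beyond the critical distance the ratio turns negative, which corresponds to the reflection-dominated case described earlier.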

VOCAL Technologies' practices include characterizing the acoustic environment in which voice enhancement devices are verified. These practices include estimating RT_{60} and D_{c}.

## References

1. Non-Linear Processing in Echo Cancellation
2. RT60 Estimation
3. Acoustic Echo Paths Characteristics and AEC
4. Patrick A. Naylor and Nikolay D. Gaubitch (editors), Speech Dereverberation (Signal and Communication Technology series), Springer-Verlag London Limited, 2010
5. H. Kuttruff, Room Acoustics, 4^{th} Edition, Taylor & Francis, 2000