diff --git a/COPYING b/COPYING
index c7aea18..9b4cfd7 100644
--- a/COPYING
+++ b/COPYING
@@ -1,6 +1,287 @@
 		    GNU GENERAL PUBLIC LICENSE
 		       Version 2, June 1991
 
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.
+ 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA
+
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+			    Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Library General Public License instead.)  You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+
+  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+		    GNU GENERAL PUBLIC LICENSE
+   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+  0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License.  The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language.  (Hereinafter, translation is included without limitation in
+the term "modification".)  Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope.  The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+  1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+  2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+    a) You must cause the modified files to carry prominent notices
+    stating that you changed the files and the date of any change.
+
+    b) You must cause any work that you distribute or publish, that in
+    whole or in part contains or is derived from the Program or any
+    part thereof, to be licensed as a whole at no charge to all third
+    parties under the terms of this License.
+
+    c) If the modified program normally reads commands interactively
+    when run, you must cause it, when started running for such
+    interactive use in the most ordinary way, to print or display an
+    announcement including an appropriate copyright notice and a
+    notice that there is no warranty (or else, saying that you provide
+    a warranty) and that users may redistribute the program under
+    these conditions, and telling the user how to view a copy of this
+    License.  (Exception: if the Program itself is interactive but
+    does not normally print such an announcement, your work based on
+    the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole.  If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works.  But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+  3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+    a) Accompany it with the complete corresponding machine-readable
+    source code, which must be distributed under the terms of Sections
+    1 and 2 above on a medium customarily used for software interchange; or,
+
+    b) Accompany it with a written offer, valid for at least three
+    years, to give any third party, for a charge no more than your
+    cost of physically performing source distribution, a complete
+    machine-readable copy of the corresponding source code, to be
+    distributed under the terms of Sections 1 and 2 above on a medium
+    customarily used for software interchange; or,
+
+    c) Accompany it with the information you received as to the offer
+    to distribute corresponding source code.  (This alternative is
+    allowed only for noncommercial distribution and only if you
+    received the program in object code or executable form with such
+    an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it.  For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable.  However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+  4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License.  Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+  5. You are not required to accept this License, since you have not
+signed it.  However, nothing else grants you permission to modify or
+distribute the Program or its derivative works.  These actions are
+prohibited by law if you do not accept this License.  Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+  6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions.  You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+  7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all.  For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices.  Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+  8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded.  In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+  9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time.  Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number.  If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation.  If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+  10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission.  For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this.  Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+			    NO WARRANTY
+
+  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+		     END OF TERMS AND CONDITIONS
+		    GNU GENERAL PUBLIC LICENSE
+		       Version 2, June 1991
+
  Copyright (C) 1989, 1991 Free Software Foundation, Inc.
                           675 Mass Ave, Cambridge, MA 02139, USA
  Everyone is permitted to copy and distribute verbatim copies
diff --git a/Makefile.in b/Makefile.in
index 5107b5c..283e19a 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -56,12 +56,12 @@ LIBRARY_INCLUDES := \
 	src/base/RingBuffer.h \
 	src/base/Scavenger.h \
 	src/dsp/AudioCurveCalculator.h \
-	src/dsp/CompoundAudioCurve.h \
-	src/dsp/ConstantAudioCurve.h \
-	src/dsp/HighFrequencyAudioCurve.h \
-	src/dsp/PercussiveAudioCurve.h \
-	src/dsp/SilentAudioCurve.h \
-	src/dsp/SpectralDifferenceAudioCurve.h \
+	src/audiocurves/CompoundAudioCurve.h \
+	src/audiocurves/ConstantAudioCurve.h \
+	src/audiocurves/HighFrequencyAudioCurve.h \
+	src/audiocurves/PercussiveAudioCurve.h \
+	src/audiocurves/SilentAudioCurve.h \
+	src/audiocurves/SpectralDifferenceAudioCurve.h \
 	src/dsp/Resampler.h \
 	src/dsp/FFT.h \
 	src/dsp/MovingMedian.h \
@@ -79,12 +79,12 @@ LIBRARY_SOURCES := \
 	src/StretchCalculator.cpp \
 	src/base/Profiler.cpp \
 	src/dsp/AudioCurveCalculator.cpp \
-	src/dsp/CompoundAudioCurve.cpp \
-	src/dsp/SpectralDifferenceAudioCurve.cpp \
-	src/dsp/HighFrequencyAudioCurve.cpp \
-	src/dsp/SilentAudioCurve.cpp \
-	src/dsp/ConstantAudioCurve.cpp \
-	src/dsp/PercussiveAudioCurve.cpp \
+	src/audiocurves/CompoundAudioCurve.cpp \
+	src/audiocurves/SpectralDifferenceAudioCurve.cpp \
+	src/audiocurves/HighFrequencyAudioCurve.cpp \
+	src/audiocurves/SilentAudioCurve.cpp \
+	src/audiocurves/ConstantAudioCurve.cpp \
+	src/audiocurves/PercussiveAudioCurve.cpp \
 	src/dsp/Resampler.cpp \
 	src/dsp/FFT.cpp \
 	src/system/Allocators.cpp \
@@ -178,50 +178,50 @@ src/rubberband-c.o: rubberband/RubberBandStretcher.h
 src/RubberBandStretcher.o: src/StretcherImpl.h
 src/RubberBandStretcher.o: rubberband/RubberBandStretcher.h src/dsp/Window.h
 src/RubberBandStretcher.o: src/dsp/SincWindow.h src/dsp/FFT.h
-src/RubberBandStretcher.o: src/dsp/CompoundAudioCurve.h
+src/RubberBandStretcher.o: src/audiocurves/CompoundAudioCurve.h
 src/RubberBandStretcher.o: src/dsp/AudioCurveCalculator.h
-src/RubberBandStretcher.o: src/dsp/PercussiveAudioCurve.h
-src/RubberBandStretcher.o: src/dsp/HighFrequencyAudioCurve.h
+src/RubberBandStretcher.o: src/audiocurves/PercussiveAudioCurve.h
+src/RubberBandStretcher.o: src/audiocurves/HighFrequencyAudioCurve.h
 src/RubberBandStretcher.o: src/dsp/SampleFilter.h src/base/RingBuffer.h
 src/RubberBandStretcher.o: src/base/Scavenger.h src/system/Thread.h
 src/RubberBandStretcher.o: src/system/sysutils.h
 src/StretcherProcess.o: src/StretcherImpl.h rubberband/RubberBandStretcher.h
 src/StretcherProcess.o: src/dsp/Window.h src/dsp/SincWindow.h src/dsp/FFT.h
-src/StretcherProcess.o: src/dsp/CompoundAudioCurve.h
+src/StretcherProcess.o: src/audiocurves/CompoundAudioCurve.h
 src/StretcherProcess.o: src/dsp/AudioCurveCalculator.h
-src/StretcherProcess.o: src/dsp/PercussiveAudioCurve.h
-src/StretcherProcess.o: src/dsp/HighFrequencyAudioCurve.h
+src/StretcherProcess.o: src/audiocurves/PercussiveAudioCurve.h
+src/StretcherProcess.o: src/audiocurves/HighFrequencyAudioCurve.h
 src/StretcherProcess.o: src/dsp/SampleFilter.h src/base/RingBuffer.h
 src/StretcherProcess.o: src/base/Scavenger.h src/system/Thread.h
-src/StretcherProcess.o: src/system/sysutils.h src/dsp/PercussiveAudioCurve.h
-src/StretcherProcess.o: src/dsp/HighFrequencyAudioCurve.h
-src/StretcherProcess.o: src/dsp/ConstantAudioCurve.h src/StretchCalculator.h
+src/StretcherProcess.o: src/system/sysutils.h src/audiocurves/PercussiveAudioCurve.h
+src/StretcherProcess.o: src/audiocurves/HighFrequencyAudioCurve.h
+src/StretcherProcess.o: src/audiocurves/ConstantAudioCurve.h src/StretchCalculator.h
 src/StretcherProcess.o: src/StretcherChannelData.h src/dsp/Resampler.h
 src/StretcherProcess.o: src/base/Profiler.h src/system/VectorOps.h
 src/StretcherProcess.o: src/system/sysutils.h
 src/StretchCalculator.o: src/StretchCalculator.h src/system/sysutils.h
 src/base/Profiler.o: src/base/Profiler.h src/system/sysutils.h
 src/dsp/AudioCurveCalculator.o: src/dsp/AudioCurveCalculator.h
-src/dsp/CompoundAudioCurve.o: src/dsp/CompoundAudioCurve.h
-src/dsp/CompoundAudioCurve.o: src/dsp/AudioCurveCalculator.h
-src/dsp/CompoundAudioCurve.o: src/dsp/PercussiveAudioCurve.h
-src/dsp/CompoundAudioCurve.o: src/dsp/HighFrequencyAudioCurve.h
-src/dsp/CompoundAudioCurve.o: src/dsp/SampleFilter.h src/dsp/MovingMedian.h
-src/dsp/SpectralDifferenceAudioCurve.o: src/dsp/SpectralDifferenceAudioCurve.h
-src/dsp/SpectralDifferenceAudioCurve.o: src/dsp/AudioCurveCalculator.h
-src/dsp/SpectralDifferenceAudioCurve.o: src/dsp/Window.h
-src/dsp/SpectralDifferenceAudioCurve.o: src/system/sysutils.h
-src/dsp/SpectralDifferenceAudioCurve.o: src/system/VectorOps.h
-src/dsp/SpectralDifferenceAudioCurve.o: src/system/sysutils.h
-src/dsp/HighFrequencyAudioCurve.o: src/dsp/HighFrequencyAudioCurve.h
-src/dsp/HighFrequencyAudioCurve.o: src/dsp/AudioCurveCalculator.h
-src/dsp/SilentAudioCurve.o: src/dsp/SilentAudioCurve.h
-src/dsp/SilentAudioCurve.o: src/dsp/AudioCurveCalculator.h
-src/dsp/ConstantAudioCurve.o: src/dsp/ConstantAudioCurve.h
-src/dsp/ConstantAudioCurve.o: src/dsp/AudioCurveCalculator.h
-src/dsp/PercussiveAudioCurve.o: src/dsp/PercussiveAudioCurve.h
-src/dsp/PercussiveAudioCurve.o: src/dsp/AudioCurveCalculator.h
-src/dsp/PercussiveAudioCurve.o: src/system/VectorOps.h src/system/sysutils.h
+src/audiocurves/CompoundAudioCurve.o: src/audiocurves/CompoundAudioCurve.h
+src/audiocurves/CompoundAudioCurve.o: src/dsp/AudioCurveCalculator.h
+src/audiocurves/CompoundAudioCurve.o: src/audiocurves/PercussiveAudioCurve.h
+src/audiocurves/CompoundAudioCurve.o: src/audiocurves/HighFrequencyAudioCurve.h
+src/audiocurves/CompoundAudioCurve.o: src/dsp/SampleFilter.h src/dsp/MovingMedian.h
+src/audiocurves/SpectralDifferenceAudioCurve.o: src/audiocurves/SpectralDifferenceAudioCurve.h
+src/audiocurves/SpectralDifferenceAudioCurve.o: src/dsp/AudioCurveCalculator.h
+src/audiocurves/SpectralDifferenceAudioCurve.o: src/dsp/Window.h
+src/audiocurves/SpectralDifferenceAudioCurve.o: src/system/sysutils.h
+src/audiocurves/SpectralDifferenceAudioCurve.o: src/system/VectorOps.h
+src/audiocurves/SpectralDifferenceAudioCurve.o: src/system/sysutils.h
+src/audiocurves/HighFrequencyAudioCurve.o: src/audiocurves/HighFrequencyAudioCurve.h
+src/audiocurves/HighFrequencyAudioCurve.o: src/dsp/AudioCurveCalculator.h
+src/audiocurves/SilentAudioCurve.o: src/audiocurves/SilentAudioCurve.h
+src/audiocurves/SilentAudioCurve.o: src/dsp/AudioCurveCalculator.h
+src/audiocurves/ConstantAudioCurve.o: src/audiocurves/ConstantAudioCurve.h
+src/audiocurves/ConstantAudioCurve.o: src/dsp/AudioCurveCalculator.h
+src/audiocurves/PercussiveAudioCurve.o: src/audiocurves/PercussiveAudioCurve.h
+src/audiocurves/PercussiveAudioCurve.o: src/dsp/AudioCurveCalculator.h
+src/audiocurves/PercussiveAudioCurve.o: src/system/VectorOps.h src/system/sysutils.h
 src/dsp/Resampler.o: src/dsp/Resampler.h src/system/sysutils.h
 src/dsp/Resampler.o: src/base/Profiler.h
 src/dsp/FFT.o: src/dsp/FFT.h src/system/sysutils.h src/system/Thread.h
@@ -234,10 +234,10 @@ src/system/Thread.o: src/system/Thread.h
 src/StretcherChannelData.o: src/StretcherChannelData.h src/StretcherImpl.h
 src/StretcherChannelData.o: rubberband/RubberBandStretcher.h src/dsp/Window.h
 src/StretcherChannelData.o: src/dsp/SincWindow.h src/dsp/FFT.h
-src/StretcherChannelData.o: src/dsp/CompoundAudioCurve.h
+src/StretcherChannelData.o: src/audiocurves/CompoundAudioCurve.h
 src/StretcherChannelData.o: src/dsp/AudioCurveCalculator.h
-src/StretcherChannelData.o: src/dsp/PercussiveAudioCurve.h
-src/StretcherChannelData.o: src/dsp/HighFrequencyAudioCurve.h
+src/StretcherChannelData.o: src/audiocurves/PercussiveAudioCurve.h
+src/StretcherChannelData.o: src/audiocurves/HighFrequencyAudioCurve.h
 src/StretcherChannelData.o: src/dsp/SampleFilter.h src/base/RingBuffer.h
 src/StretcherChannelData.o: src/base/Scavenger.h src/system/Thread.h
 src/StretcherChannelData.o: src/system/sysutils.h src/dsp/Resampler.h
@@ -245,17 +245,17 @@ src/StretcherChannelData.o: src/system/Allocators.h src/system/VectorOps.h
 src/StretcherChannelData.o: src/system/sysutils.h
 src/StretcherImpl.o: src/StretcherImpl.h rubberband/RubberBandStretcher.h
 src/StretcherImpl.o: src/dsp/Window.h src/dsp/SincWindow.h src/dsp/FFT.h
-src/StretcherImpl.o: src/dsp/CompoundAudioCurve.h
+src/StretcherImpl.o: src/audiocurves/CompoundAudioCurve.h
 src/StretcherImpl.o: src/dsp/AudioCurveCalculator.h
-src/StretcherImpl.o: src/dsp/PercussiveAudioCurve.h
-src/StretcherImpl.o: src/dsp/HighFrequencyAudioCurve.h src/dsp/SampleFilter.h
+src/StretcherImpl.o: src/audiocurves/PercussiveAudioCurve.h
+src/StretcherImpl.o: src/audiocurves/HighFrequencyAudioCurve.h src/dsp/SampleFilter.h
 src/StretcherImpl.o: src/base/RingBuffer.h src/base/Scavenger.h
 src/StretcherImpl.o: src/system/Thread.h src/system/sysutils.h
-src/StretcherImpl.o: src/dsp/PercussiveAudioCurve.h
-src/StretcherImpl.o: src/dsp/HighFrequencyAudioCurve.h
-src/StretcherImpl.o: src/dsp/SpectralDifferenceAudioCurve.h src/dsp/Window.h
+src/StretcherImpl.o: src/audiocurves/PercussiveAudioCurve.h
+src/StretcherImpl.o: src/audiocurves/HighFrequencyAudioCurve.h
+src/StretcherImpl.o: src/audiocurves/SpectralDifferenceAudioCurve.h src/dsp/Window.h
 src/StretcherImpl.o: src/system/VectorOps.h src/system/sysutils.h
-src/StretcherImpl.o: src/dsp/SilentAudioCurve.h src/dsp/ConstantAudioCurve.h
+src/StretcherImpl.o: src/audiocurves/SilentAudioCurve.h src/audiocurves/ConstantAudioCurve.h
 src/StretcherImpl.o: src/dsp/Resampler.h src/StretchCalculator.h
 src/StretcherImpl.o: src/StretcherChannelData.h src/base/Profiler.h
 main/main.o: rubberband/RubberBandStretcher.h src/system/sysutils.h
diff --git a/Makefile.osx b/Makefile.osx
new file mode 100644
index 0000000..aec6095
--- /dev/null
+++ b/Makefile.osx
@@ -0,0 +1,224 @@
+
+CXX		:= g++
+CC		:= gcc
+ARCHFLAGS	:= -arch i386 -arch x86_64
+OPTFLAGS	:= -DNDEBUG -ffast-math -mfpmath=sse -msse -msse2 -O3 -ftree-vectorize
+
+CXXFLAGS	:= $(ARCHFLAGS) $(OPTFLAGS) -I/usr/local/include -DUSE_PTHREADS -DHAVE_VDSP -DUSE_SPEEX -DNO_THREAD_CHECKS -DNO_TIMING -Irubberband -I. -Isrc
+
+LIBRARY_LIBS		:= -framework Accelerate
+
+CFLAGS		:= $(ARCHFLAGS) $(OPTFLAGS)
+LDFLAGS		:= $(ARCHFLAGS) -lpthread $(LDFLAGS)
+
+PROGRAM_LIBS		:= -L/usr/local/lib -lsndfile $(LIBRARY_LIBS)
+VAMP_PLUGIN_LIBS	:= -L/usr/local/lib -lvamp-sdk $(LIBRARY_LIBS)
+LADSPA_PLUGIN_LIBS	:= $(LIBRARY_LIBS)
+
+MKDIR			:= mkdir
+AR			:= ar
+
+DYNAMIC_LDFLAGS		:= -dynamiclib
+DYNAMIC_EXTENSION	:= .dylib
+
+PROGRAM_TARGET 		:= bin/rubberband
+STATIC_TARGET  		:= lib/librubberband.a
+DYNAMIC_TARGET 		:= lib/librubberband$(DYNAMIC_EXTENSION)
+VAMP_TARGET    		:= lib/vamp-rubberband$(DYNAMIC_EXTENSION)
+LADSPA_TARGET  		:= lib/ladspa-rubberband$(DYNAMIC_EXTENSION)
+
+default:	bin lib $(STATIC_TARGET) $(DYNAMIC_TARGET) $(PROGRAM_TARGET)
+
+all:	bin lib $(STATIC_TARGET) $(DYNAMIC_TARGET) $(PROGRAM_TARGET) $(VAMP_TARGET) $(LADSPA_TARGET)
+
+static:		$(STATIC_TARGET)
+dynamic:	$(DYNAMIC_TARGET)
+library:	$(STATIC_TARGET) $(DYNAMIC_TARGET)
+program:	$(PROGRAM_TARGET)
+vamp:		$(VAMP_TARGET)
+ladspa:		$(LADSPA_TARGET)
+
+PUBLIC_INCLUDES := \
+	rubberband/rubberband-c.h \
+	rubberband/RubberBandStretcher.h
+
+LIBRARY_INCLUDES := \
+	src/StretcherChannelData.h \
+	src/float_cast/float_cast.h \
+	src/StretcherImpl.h \
+	src/StretchCalculator.h \
+	src/base/Profiler.h \
+	src/base/RingBuffer.h \
+	src/base/Scavenger.h \
+	src/dsp/AudioCurveCalculator.h \
+	src/audiocurves/CompoundAudioCurve.h \
+	src/audiocurves/ConstantAudioCurve.h \
+	src/audiocurves/HighFrequencyAudioCurve.h \
+	src/audiocurves/PercussiveAudioCurve.h \
+	src/audiocurves/SilentAudioCurve.h \
+	src/audiocurves/SpectralDifferenceAudioCurve.h \
+	src/dsp/Resampler.h \
+	src/dsp/FFT.h \
+	src/dsp/MovingMedian.h \
+	src/dsp/SincWindow.h \
+	src/dsp/Window.h \
+	src/system/Allocators.h \
+	src/system/Thread.h \
+	src/system/VectorOps.h \
+	src/system/VectorOpsComplex.h \
+	src/system/sysutils.h
+
+LIBRARY_SOURCES := \
+	src/rubberband-c.cpp \
+	src/RubberBandStretcher.cpp \
+	src/StretcherProcess.cpp \
+	src/StretchCalculator.cpp \
+	src/base/Profiler.cpp \
+	src/dsp/AudioCurveCalculator.cpp \
+	src/audiocurves/CompoundAudioCurve.cpp \
+	src/audiocurves/SpectralDifferenceAudioCurve.cpp \
+	src/audiocurves/HighFrequencyAudioCurve.cpp \
+	src/audiocurves/SilentAudioCurve.cpp \
+	src/audiocurves/ConstantAudioCurve.cpp \
+	src/audiocurves/PercussiveAudioCurve.cpp \
+	src/dsp/Resampler.cpp \
+	src/dsp/FFT.cpp \
+	src/system/Allocators.cpp \
+	src/system/sysutils.cpp \
+	src/system/Thread.cpp \
+	src/system/VectorOpsComplex.cpp \
+	src/StretcherChannelData.cpp \
+	src/StretcherImpl.cpp
+
+# For Speex resampler -- comment these lines out if not specifying USE_SPEEX
+LIBRARY_INCLUDES := $(LIBRARY_INCLUDES) \
+	src/speex/speex_resampler.h
+LIBRARY_SOURCES := $(LIBRARY_SOURCES) \
+	src/speex/resample.c
+
+PROGRAM_SOURCES := \
+	main/main.cpp
+
+VAMP_HEADERS := \
+	vamp/RubberBandVampPlugin.h
+
+VAMP_SOURCES := \
+	vamp/RubberBandVampPlugin.cpp \
+	vamp/libmain.cpp
+
+LADSPA_HEADERS := \
+	ladspa/RubberBandPitchShifter.h
+
+LADSPA_SOURCES := \
+	ladspa/RubberBandPitchShifter.cpp \
+	ladspa/libmain.cpp
+
+LIBRARY_OBJECTS := $(LIBRARY_SOURCES:.cpp=.o)
+LIBRARY_OBJECTS := $(LIBRARY_OBJECTS:.c=.o)
+
+PROGRAM_OBJECTS := $(PROGRAM_SOURCES:.cpp=.o)
+VAMP_OBJECTS    := $(VAMP_SOURCES:.cpp=.o)
+LADSPA_OBJECTS  := $(LADSPA_SOURCES:.cpp=.o)
+
+$(PROGRAM_TARGET):	$(LIBRARY_OBJECTS) $(PROGRAM_OBJECTS)
+	$(CXX) -o $@ $^ $(PROGRAM_LIBS) $(LDFLAGS)
+
+$(STATIC_TARGET):	$(LIBRARY_OBJECTS)
+	$(AR) rc $@ $^
+
+$(DYNAMIC_TARGET):	$(LIBRARY_OBJECTS)
+	$(CXX) $(DYNAMIC_LDFLAGS) $^ -o $@ $(LIBRARY_LIBS) $(LDFLAGS)
+
+$(VAMP_TARGET):		$(LIBRARY_OBJECTS) $(VAMP_OBJECTS)
+	$(CXX) $(DYNAMIC_LDFLAGS) -o $@ $^ $(VAMP_PLUGIN_LIBS) $(LDFLAGS)
+
+$(LADSPA_TARGET):	$(LIBRARY_OBJECTS) $(LADSPA_OBJECTS)
+	$(CXX) $(DYNAMIC_LDFLAGS) -o $@ $^ $(LADSPA_PLUGIN_LIBS) $(LDFLAGS)
+
+bin:
+	$(MKDIR) $@
+lib:
+	$(MKDIR) $@
+
+clean:
+	rm -f $(LIBRARY_OBJECTS) $(PROGRAM_OBJECTS) $(LADSPA_OBJECTS) $(VAMP_OBJECTS)
+
+distclean:	clean
+	rm -f $(PROGRAM_TARGET) $(STATIC_TARGET) $(DYNAMIC_TARGET) $(VAMP_TARGET) $(LADSPA_TARGET)
+
+depend:
+	makedepend -fMakefile.osx -Y $(LIBRARY_SOURCES) $(PROGRAM_SOURCES)
+
+
+# DO NOT DELETE
+
+src/rubberband-c.o: rubberband/rubberband-c.h
+src/rubberband-c.o: rubberband/RubberBandStretcher.h
+src/RubberBandStretcher.o: src/StretcherImpl.h
+src/RubberBandStretcher.o: rubberband/RubberBandStretcher.h src/dsp/Window.h
+src/RubberBandStretcher.o: src/dsp/FFT.h src/base/RingBuffer.h
+src/RubberBandStretcher.o: src/base/Scavenger.h src/system/Thread.h
+src/RubberBandStretcher.o: src/system/Thread.h src/system/sysutils.h
+src/StretcherProcess.o: src/StretcherImpl.h rubberband/RubberBandStretcher.h
+src/StretcherProcess.o: src/dsp/Window.h src/dsp/FFT.h src/base/RingBuffer.h
+src/StretcherProcess.o: src/base/Scavenger.h src/system/Thread.h
+src/StretcherProcess.o: src/system/Thread.h src/system/sysutils.h
+src/StretcherProcess.o: src/audiocurves/PercussiveAudioCurve.h
+src/StretcherProcess.o: src/dsp/AudioCurveCalculator.h
+src/StretcherProcess.o: src/audiocurves/HighFrequencyAudioCurve.h
+src/StretcherProcess.o: src/audiocurves/ConstantAudioCurve.h src/StretchCalculator.h
+src/StretcherProcess.o: src/StretcherChannelData.h src/dsp/Resampler.h
+src/StretcherProcess.o: src/base/Profiler.h src/system/VectorOps.h
+src/StretcherProcess.o: src/system/sysutils.h
+src/StretchCalculator.o: src/StretchCalculator.h src/system/sysutils.h
+src/system/Thread.o: src/system/Thread.h
+src/base/Profiler.o: src/base/Profiler.h src/system/sysutils.h
+src/dsp/AudioCurveCalculator.o: src/dsp/AudioCurveCalculator.h
+src/dsp/AudioCurveCalculator.o: src/system/sysutils.h
+src/audiocurves/SpectralDifferenceAudioCurve.o: src/audiocurves/SpectralDifferenceAudioCurve.h
+src/audiocurves/SpectralDifferenceAudioCurve.o: src/dsp/AudioCurveCalculator.h
+src/audiocurves/SpectralDifferenceAudioCurve.o: src/system/sysutils.h
+src/audiocurves/SpectralDifferenceAudioCurve.o: src/dsp/Window.h
+src/audiocurves/SpectralDifferenceAudioCurve.o: src/system/VectorOps.h
+src/audiocurves/SpectralDifferenceAudioCurve.o: src/system/sysutils.h
+src/audiocurves/HighFrequencyAudioCurve.o: src/audiocurves/HighFrequencyAudioCurve.h
+src/audiocurves/HighFrequencyAudioCurve.o: src/dsp/AudioCurveCalculator.h
+src/audiocurves/HighFrequencyAudioCurve.o: src/system/sysutils.h
+src/audiocurves/SilentAudioCurve.o: src/audiocurves/SilentAudioCurve.h
+src/audiocurves/SilentAudioCurve.o: src/dsp/AudioCurveCalculator.h
+src/audiocurves/SilentAudioCurve.o: src/system/sysutils.h
+src/audiocurves/ConstantAudioCurve.o: src/audiocurves/ConstantAudioCurve.h
+src/audiocurves/ConstantAudioCurve.o: src/dsp/AudioCurveCalculator.h
+src/audiocurves/ConstantAudioCurve.o: src/system/sysutils.h
+src/audiocurves/PercussiveAudioCurve.o: src/audiocurves/PercussiveAudioCurve.h
+src/audiocurves/PercussiveAudioCurve.o: src/dsp/AudioCurveCalculator.h
+src/audiocurves/PercussiveAudioCurve.o: src/system/sysutils.h src/system/VectorOps.h
+src/audiocurves/PercussiveAudioCurve.o: src/system/sysutils.h
+src/dsp/Resampler.o: src/dsp/Resampler.h src/system/sysutils.h
+src/dsp/Resampler.o: src/base/Profiler.h
+src/dsp/FFT.o: src/dsp/FFT.h src/system/sysutils.h src/system/Thread.h
+src/dsp/FFT.o: src/base/Profiler.h src/system/VectorOps.h
+src/dsp/FFT.o: src/system/sysutils.h
+src/system/Allocators.o: src/system/Allocators.h src/system/VectorOps.h
+src/system/Allocators.o: src/system/sysutils.h
+src/system/sysutils.o: src/system/sysutils.h
+src/StretcherChannelData.o: src/StretcherChannelData.h src/StretcherImpl.h
+src/StretcherChannelData.o: rubberband/RubberBandStretcher.h src/dsp/Window.h
+src/StretcherChannelData.o: src/dsp/FFT.h src/base/RingBuffer.h
+src/StretcherChannelData.o: src/base/Scavenger.h src/system/Thread.h
+src/StretcherChannelData.o: src/system/Thread.h src/system/sysutils.h
+src/StretcherChannelData.o: src/dsp/Resampler.h src/system/Allocators.h
+src/StretcherChannelData.o: src/system/VectorOps.h src/system/sysutils.h
+src/StretcherImpl.o: src/StretcherImpl.h rubberband/RubberBandStretcher.h
+src/StretcherImpl.o: src/dsp/Window.h src/dsp/FFT.h src/base/RingBuffer.h
+src/StretcherImpl.o: src/base/Scavenger.h src/system/Thread.h src/system/Thread.h
+src/StretcherImpl.o: src/system/sysutils.h src/audiocurves/PercussiveAudioCurve.h
+src/StretcherImpl.o: src/dsp/AudioCurveCalculator.h
+src/StretcherImpl.o: src/audiocurves/HighFrequencyAudioCurve.h
+src/StretcherImpl.o: src/audiocurves/SpectralDifferenceAudioCurve.h src/dsp/Window.h
+src/StretcherImpl.o: src/system/VectorOps.h src/system/sysutils.h
+src/StretcherImpl.o: src/audiocurves/SilentAudioCurve.h src/audiocurves/ConstantAudioCurve.h
+src/StretcherImpl.o: src/dsp/Resampler.h src/StretchCalculator.h
+src/StretcherImpl.o: src/StretcherChannelData.h src/base/Profiler.h
+main/main.o: rubberband/RubberBandStretcher.h src/system/sysutils.h
+main/main.o: src/base/Profiler.h
diff --git a/README.txt b/README.txt
index 26d6fc6..7901b7a 100644
--- a/README.txt
+++ b/README.txt
@@ -4,155 +4,86 @@ Rubber Band
 
 An audio time-stretching and pitch-shifting library and utility program.
 
-Copyright 2007-2011 Chris Cannam, chris.cannam@breakfastquay.com.
+Written by Chris Cannam, chris.cannam@breakfastquay.com.
+Copyright 2007-2012 Particular Programs Ltd.
 
-Distributed under the GNU General Public License.
+Rubber Band is a library and utility program that permits changing the
+tempo and pitch of an audio recording independently of one another.
+
+See http://breakfastquay.com/rubberband/ for more information.
 
 
-Contents
-========
+Licence
+=======
 
-1. About Rubber Band
-    - Attractive features
-    - Limitations
+Rubber Band is distributed under the GNU General Public License. See
+the file COPYING for more information.
 
-2. Compiling Rubber Band
+If you wish to distribute code using the Rubber Band Library under
+terms other than those of the GNU General Public License, you must
+obtain a commercial licence from us before doing so. In particular,
+you may not legally distribute through any Apple App Store unless you
+have a commercial licence.  See http://breakfastquay.com/rubberband/
+for licence terms.
 
-3. Using the Rubber Band utility
+If you have obtained a valid commercial licence, your licence
+supersedes this README and the enclosed COPYING file and you may
+redistribute and/or modify Rubber Band under the terms described in
+that licence. Please refer to your licence agreement for more details.
 
-4. Using the Rubber Band library
+Note that Rubber Band may link with other GPL libraries or with
+proprietary libraries, depending on its build configuration. See the
+section "FFT and resampler selection" below. It is your responsibility
+to ensure that you redistribute only in accordance with the licence
+terms of any other libraries you may build with.
 
 
+Contents of this README
+-----------------------
 
-About Rubber Band
------------------
-
-Rubber Band is a library and utility program that permits you to
-change the tempo and pitch of an audio recording independently of one
-another.
+1. Code components
+2. Using the Rubber Band command-line tool
+3. Using the Rubber Band Library
+4. Compiling Rubber Band
+   a. FFT and resampler selection
+   b. Other supported #defines
+   c. GNU/POSIX systems and Makefiles
+   d. OS/X and iOS
+   e. Win32 and Visual Studio
+   f. Android and Java
+5. Copyright notes for bundled libraries
 
 
-Attractive features
-~~~~~~~~~~~~~~~~~~~
+1. Code components
+------------------
 
-  * High quality results suitable for musical use
+Rubber Band consists of:
 
-    Rubber Band is a phase-vocoder-based frequency domain time
-    stretcher with phase resynchronisation at noisy transients and a
-    phase lamination technique to reduce phasiness.  It is suitable for
-    most musical uses with its default settings, and has a range of
-    options for fine tuning.
+ * The Rubber Band library code.  This is the code that will normally
+   be used by your applications.  The headers for this are in the
+   rubberband/ directory, and the source code is in src/.
+   The Rubber Band library depends upon resampler and FFT code; see
+   section 4a below for details.
 
-  * Real-time capable
+ * The Rubber Band command-line tool.  This is in main/main.cpp.
+   This program uses the Rubber Band library and also requires libsndfile
+   (http://www.mega-nerd.com/libsndfile/, licensed under the GNU Lesser
+   General Public License) for audio file loading.
 
-    In addition to the offline mode (for use in situations where all
-    audio data is available beforehand), Rubber Band supports a true
-    real-time, lock-free streaming mode, in which the time and pitch
-    scaling ratios may be dynamically adjusted during use.
+ * A pitch-shifter LADSPA audio effects plugin.  This is in ladspa/.
+   It requires the LADSPA SDK header ladspa.h (not included).
 
-  * Sample-accurate duration adjustment
-
-    In offline mode, Rubber Band ensures that the output has exactly
-    the right number of samples for the given stretch ratio.  (In
-    real-time mode Rubber Band aims to keep as closely as possible to
-    the exact ratio, although this depends on the audio material
-    itself.)
-
-  * Multiprocessor/multi-core support
-
-    Rubber Band's offline mode can take advantage of more than one
-    processor core if available, when processing data with two or more
-    audio channels.
-
-  * No job too big, or too small
-
-    Rubber Band is tuned so as to work well with the default settings
-    for any stretch ratio, from tiny deviations from the original
-    speed to very extreme stretches.
-
-  * Handy utilities included
-
-    The Rubber Band code includes a useful command-line time-stretch
-    and pitch shift utility (called simply rubberband), two LADSPA
-    pitch shifter plugins (Rubber Band Mono Pitch Shifter and Rubber
-    Band Stereo Pitch Shifter), and a Vamp audio analysis plugin which
-    may be used to inspect the stretch profile decisions Rubber Band
-    is taking.
-
-  * Free Software
-
-    Rubber Band is Free Software published under the GNU General
-    Public License.
+ * A Vamp audio analysis plugin which may be used to inspect the
+   dynamic stretch ratios and other decisions taken by the Rubber Band
+   library when in use.  This is in vamp/.  It requires the Vamp
+   plugin SDK (http://www.vamp-plugins.org/develop.html) (not included).
 
 
-Limitations
-~~~~~~~~~~~
+2. Using the Rubber Band command-line tool
+------------------------------------------
 
-  * Not especially fast
-
-    The algorithm used by Rubber Band is very processor intensive, and
-    Rubber Band is not the fastest implementation on earth.
-
-  * Not especially state of the art
-
-    Rubber Band employs well known algorithms which work well in many
-    situations, but it isn't "cutting edge" in any interesting sense.
-
-  * Relatively complex
-
-    While the fundamental algorithms in Rubber Band are not especially
-    complex, the implementation is complicated by the support for
-    multiple processing modes, exact sample precision, threading, and
-    other features that add to the flexibility of the API.
-
-
-Compiling Rubber Band
----------------------
-
-Rubber Band Library is supplied with a configure script for Linux and
-other systems with pkg-config, and a separate Makefile for basic OS/X
-builds without pkg-config.  It's also possible to build the Rubber
-Band Library GPL edition for Windows using MinGW, though you'll have
-to hack your own Makefile for that.
-
-
-Using configure
-~~~~~~~~~~~~~~~
-
-To build Rubber Band you will also need libsndfile, libsamplerate,
-FFTW3, the Vamp plugin SDK, the LADSPA plugin header, the pthread
-library (except on Win32), and a C++ compiler.  The code has been
-tested with GCC 4.x and with the Intel C++ compiler.
-
-Rubber Band comes with a simple autoconf script.  Run 
-
-  $ ./configure
-  $ make
-
-to compile, and optionally
-
-  # make install
-
-to install.
-
-
-Simple build for OS/X
-~~~~~~~~~~~~~~~~~~~~~
-
-To build just the library (but not the command-line utility, Vamp
-plugin or LADSPA plugin) for OS/X, run
-
-  $ make -f build/Makefile.osx
-
-You will need libsamplerate and libfftw3 installed, but no other
-non-system dependencies.
-
-
-Using the Rubber Band utility
------------------------------
-
-The Rubber Band command-line utility builds as bin/rubberband.  The
-basic incantation is
+The Rubber Band command-line tool builds as bin/rubberband.  The basic
+incantation is
 
   $ rubberband -t <timeratio> -p <pitchratio> <infile.wav> <outfile.wav>
 
@@ -168,8 +99,8 @@ In particular, different types of music may benefit from different
 "crispness" options (-c <n> where <n> is from 0 to 6).
 
 
-Using the Rubber Band library
------------------------------
+3. Using the Rubber Band library
+--------------------------------
 
 The Rubber Band library has a public API that consists of one C++
 class, called RubberBandStretcher in the RubberBand namespace.  You
@@ -189,11 +120,353 @@ pitch shifter plugin (ladspa/RubberBandPitchShifter.cpp) may be used
 as an example of Rubber Band in real-time mode.
 
 IMPORTANT: Please ensure you have read and understood the licensing
-terms for Rubber Band before using it in another application.  This
+terms for Rubber Band before using it in your application.  This
 library is provided under the GNU General Public License, which means
 that any application that uses it must also be published under the GPL
-or a compatible license (i.e. with its full source code also available
-for modification and redistribution).  See the file COPYING for more
-details.  Alternative commercial and proprietary licensing terms are
-available; please contact the author if you are interested.
+or a compatible licence (i.e. with its full source code also available
+for modification and redistribution) unless you have separately
+acquired a commercial licence from the author.
 
+
+4. Compiling Rubber Band
+------------------------
+
+4a. FFT and resampler selection
+-------------------------------
+
+Rubber Band requires additional library code for FFT calculation and
+resampling.  Several libraries are supported.  The selection is
+controlled using preprocessor flags at compile time, as detailed in
+the tables below.
+
+Flags that declare that you want to use an external library begin with
+HAVE_; flags that select from the bundled options begin with USE_.
+
+You must enable one resampler implementation and one FFT
+implementation.  Do not enable more than one of either unless you know
+what you're doing.
+
+If you are building this software using one of the bundled library
+options (Speex or KissFFT), please be sure to review the terms for
+those libraries in src/speex/COPYING and src/kissfft/COPYING as
+applicable.
+
+FFT libraries supported
+-----------------------
+
+Name           Flags required        Notes
+----           --------------        -----
+
+FFTW3          -DHAVE_FFTW3          GPL.
+
+Accelerate     -DHAVE_VDSP           Platform library on OS/X and iOS.
+
+Intel IPP      -DHAVE_IPP            Proprietary library, can only be used with
+                                     Rubber Band commercial licence. Define
+                                     USE_IPP_STATIC as well to build with static
+                                     IPP libraries.
+
+KissFFT        -DUSE_KISSFFT         Bundled, can be used with GPL or commercial
+                                     licence.  Single-precision. Slower than the
+                                     above options.
+
+Resampler libraries supported
+-----------------------------
+
+Name           Flags required        Notes
+----           --------------        -----
+
+libsamplerate  -DHAVE_LIBSAMPLERATE  GPL.
+
+libresample    -DHAVE_LIBRESAMPLE    LGPL.
+
+Speex          -DUSE_SPEEX           Bundled, can be used with GPL or commercial
+                                     licence.
+
+
+4b. Other supported #defines
+----------------------------
+
+Other symbols you may define at compile time are as follows. (Usually
+the supplied build files will handle these for you.)
+
+   -DLACK_BAD_ALLOC
+   Define on systems lacking std::bad_alloc in the C++ library.
+
+   -DLACK_POSIX_MEMALIGN
+   Define on systems lacking posix_memalign.
+
+   -DUSE_OWN_ALIGNED_MALLOC
+   Define on systems lacking any aligned malloc implementation.
+
+   -DLACK_SINCOS
+   Define on systems lacking sincos().
+   
+   -DNO_EXCEPTIONS
+   Build without use of C++ exceptions.
+
+   -DNO_THREADING
+   Build without any multithread support.
+
+   -DPROCESS_SAMPLE_TYPE=float
+   Select single precision for internal calculations. The default is
+   double precision. Consider using for mobile architectures with
+   slower double-precision support.
+
+   -DUSE_POMMIER_MATHFUN
+   Select the Julien Pommier implementations of trig functions for ARM
+   NEON or x86 SSE architectures. These are usually faster but may be
+   of lower precision than system implementations. Consider using this
+   for mobile architectures.
+
+
+4c. GNU/POSIX systems and Makefiles
+-----------------------------------
+
+A GNU-style configure script is included for use on Linux and similar
+systems.
+
+Run ./configure, then adjust the generated Makefile according to your
+preference for FFT and resampler implementations.  The default is to
+use FFTW3 and libsamplerate.
+
+The following Makefile targets are available:
+
+  static  -- build static libraries only
+  dynamic -- build dynamic libraries only
+  library -- build static and dynamic libraries only
+  program -- build the command-line tool
+  vamp    -- build Vamp plugin
+  ladspa  -- build LADSPA plugin
+  all     -- build everything.
+
+The default target is "all".
+
+
+4d. OS/X and iOS
+----------------
+
+A Makefile for OS/X is provided as Makefile.osx.
+
+Adjust the Makefile according to your preference for compiler and
+platform SDK, FFT and resampler implementations.  The default is to
+use the Accelerate framework and the Speex resampler.
+
+The following Makefile targets are available:
+
+  static  -- build static libraries only
+  dynamic -- build dynamic libraries only
+  library -- build static and dynamic libraries only
+  program -- build the command-line tool
+  vamp    -- build Vamp plugin
+  ladspa  -- build LADSPA plugin
+  all     -- build everything.
+
+The default target is to build the static and dynamic libraries and
+the command-line tool.  The sndfile library is required for the
+command-line tool.
+
+If you prefer to add the Rubber Band library files to an existing
+build project instead of using the Makefile, the files in src/ (except
+for RubberBandStretcherJNI.cpp) and the API headers in rubberband/
+should be all you need.
+
+Note that you cannot legally distribute applications using Rubber Band
+through the iPhone/iPad App Store or OS/X App Store unless you have a
+valid commercial licence.  GPL code is not permitted in these stores.
+
+
+4e. Win32 and Visual Studio
+---------------------------
+
+Two Visual Studio 2005 projects are supplied.
+
+rubberband-library.vcproj builds the Rubber Band static libraries
+only.
+
+rubberband-program.vcproj builds the Rubber Band command-line tool
+only (requires the Rubber Band libraries, and libsndfile).
+
+You will need to adjust the project settings so as to set the compile
+flags according to your preference for FFT and resampler
+implementation, and set the include path and library path
+appropriately.  The default is to use the bundled KissFFT and the
+Speex resampler.
+
+If you prefer to add the Rubber Band library files to an existing
+build project instead of using the supplied one, the files in src/
+(except for RubberBandStretcherJNI.cpp) and the API headers in
+rubberband/ should be all you need.
+
+
+4f. Android and Java
+--------------------
+
+An Android NDK build file is provided as Android.mk. This includes
+compile definitions for a shared library built for ARM architectures
+which can be loaded from a Java application using the Java Native
+Interface (JNI), as supported by the Android NDK.
+
+The Java side of the interface can be found in
+com/breakfastquay/rubberband/RubberBandStretcher.java.
+
+The supplied .mk file uses KissFFT and the Speex resampler.
+
+
+5. Copyright notes for bundled libraries
+----------------------------------------
+
+5a. Speex
+---------
+
+[files in src/speex]
+
+Copyright 2002-2007 	Xiph.org Foundation
+Copyright 2002-2007 	Jean-Marc Valin
+Copyright 2005-2007	Analog Devices Inc.
+Copyright 2005-2007	Commonwealth Scientific and Industrial Research 
+                        Organisation (CSIRO)
+Copyright 1993, 2002, 2006 David Rowe
+Copyright 2003 		EpicGames
+Copyright 1992-1994	Jutta Degener, Carsten Bormann
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+- Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+
+- Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+
+- Neither the name of the Xiph.org Foundation nor the names of its
+contributors may be used to endorse or promote products derived from
+this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR
+CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+5b. KissFFT
+-----------
+
+[files in src/kissfft]
+
+Copyright (c) 2003-2004 Mark Borgerding
+
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+    * Redistributions of source code must retain the above copyright
+      notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above
+      copyright notice, this list of conditions and the following
+      disclaimer in the documentation and/or other materials provided
+      with the distribution.
+    * Neither the author nor the names of any contributors may be used
+      to endorse or promote products derived from this software
+      without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+5c. Pommier math functions
+--------------------------
+
+[files in src/pommier]
+
+Copyright (C) 2011  Julien Pommier
+
+  This software is provided 'as-is', without any express or implied
+  warranty.  In no event will the authors be held liable for any damages
+  arising from the use of this software.
+
+  Permission is granted to anyone to use this software for any purpose,
+  including commercial applications, and to alter it and redistribute it
+  freely, subject to the following restrictions:
+
+  1. The origin of this software must not be misrepresented; you must not
+     claim that you wrote the original software. If you use this software
+     in a product, an acknowledgment in the product documentation would be
+     appreciated but is not required.
+  2. Altered source versions must be plainly marked as such, and must not be
+     misrepresented as being the original software.
+  3. This notice may not be removed or altered from any source distribution.
+
+
+5d. float_cast
+--------------
+
+[files in src/float_cast]
+
+Copyright (C) 2001 Erik de Castro Lopo <erikd AT mega-nerd DOT com>
+
+Permission to use, copy, modify, distribute, and sell this file for any 
+purpose is hereby granted without fee, provided that the above copyright 
+and this permission notice appear in all copies.  No representations are
+made about the suitability of this software for any purpose.  It is 
+provided "as is" without express or implied warranty.
+
+
+5e. getopt
+----------
+
+[files in src/getopt, used by command-line tool on some platforms]
+
+Copyright (c) 2000 The NetBSD Foundation, Inc.
+All rights reserved.
+
+This code is derived from software contributed to The NetBSD Foundation
+by Dieter Baron and Thomas Klausner.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+1. Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer.
+2. Redistributions in binary form must reproduce the above copyright
+   notice, this list of conditions and the following disclaimer in the
+   documentation and/or other materials provided with the distribution.
+3. All advertising materials mentioning features or use of this software
+   must display the following acknowledgement:
+       This product includes software developed by the NetBSD
+       Foundation, Inc. and its contributors.
+4. Neither the name of The NetBSD Foundation nor the names of its
+   contributors may be used to endorse or promote products derived
+   from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
+``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
+BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
diff --git a/build/Makefile.osx b/build/Makefile.osx
deleted file mode 100644
index 93e0639..0000000
--- a/build/Makefile.osx
+++ /dev/null
@@ -1,191 +0,0 @@
-
-# OS/X-specific Makefile for the Rubber Band Library only.
-# Does not build the Vamp plugin, LADSPA plugin, or command-line utility.
-
-CXX		:= g++
-CXXFLAGS	:= -DHAVE_FFTW3 -DFFTW_DOUBLE_ONLY -DNO_THREAD_CHECKS -DNO_TIMING -DNDEBUG -O3 -arch i386 -msse -msse2 -ffast-math -ftree-vectorize -I../include -I/usr/local/include -Irubberband -I. -Isrc
-LDFLAGS		:= -arch i386 -L../lib -L/usr/local/lib
-
-# CXX		:= g++-4.0
-# CXXFLAGS	:= -DHAVE_FFTW3 -DFFTW_DOUBLE_ONLY -DNO_THREAD_CHECKS -DNO_TIMING -DNDEBUG -mmacosx-version-min=10.4 -isysroot /Developer/SDKs/MacOSX10.4u.sdk -O3 -arch i386 -msse -msse2 -ffast-math -ftree-vectorize -I../include -I/usr/local/include -Irubberband -I. -Isrc
-# LDFLAGS		:= -mmacosx-version-min=10.4 -isysroot /Developer/SDKs/MacOSX10.4u.sdk -arch i386 -L../lib -L/usr/local/lib
-
-LIBRARY_LIBS		:= -lsamplerate -lfftw3 -lpthread -lm
-
-MKDIR			:= mkdir
-AR			:= ar
-
-DYNAMIC_LDFLAGS 	:= -dynamiclib
-DYNAMIC_EXTENSION	:= .dylib
-
-STATIC_TARGET  		:= lib/librubberband.a
-DYNAMIC_TARGET 		:= lib/librubberband$(DYNAMIC_EXTENSION)
-
-INSTALL_BINDIR		:= /usr/local/bin
-INSTALL_INCDIR		:= /usr/local/include/rubberband
-INSTALL_LIBDIR		:= /usr/local/lib
-INSTALL_PKGDIR		:= /usr/local/lib/pkgconfig
-
-all: bin lib $(STATIC_TARGET) $(DYNAMIC_TARGET)
-
-PUBLIC_INCLUDES := \
-	rubberband/rubberband-c.h \
-	rubberband/RubberBandStretcher.h
-
-LIBRARY_INCLUDES := \
-	src/StretcherChannelData.h \
-	src/float_cast/float_cast.h \
-	src/StretcherImpl.h \
-	src/StretchCalculator.h \
-	src/base/Profiler.h \
-	src/base/RingBuffer.h \
-	src/base/Scavenger.h \
-	src/dsp/AudioCurveCalculator.h \
-	src/dsp/CompoundAudioCurve.h \
-	src/dsp/ConstantAudioCurve.h \
-	src/dsp/HighFrequencyAudioCurve.h \
-	src/dsp/PercussiveAudioCurve.h \
-	src/dsp/SilentAudioCurve.h \
-	src/dsp/SpectralDifferenceAudioCurve.h \
-	src/dsp/Resampler.h \
-	src/dsp/FFT.h \
-	src/dsp/MovingMedian.h \
-	src/dsp/SincWindow.h \
-	src/dsp/Window.h \
-	src/system/Allocators.h \
-	src/system/Thread.h \
-	src/system/VectorOps.h \
-	src/system/sysutils.h
-
-LIBRARY_SOURCES := \
-	src/rubberband-c.cpp \
-	src/RubberBandStretcher.cpp \
-	src/StretcherProcess.cpp \
-	src/StretchCalculator.cpp \
-	src/base/Profiler.cpp \
-	src/dsp/AudioCurveCalculator.cpp \
-	src/dsp/CompoundAudioCurve.cpp \
-	src/dsp/SpectralDifferenceAudioCurve.cpp \
-	src/dsp/HighFrequencyAudioCurve.cpp \
-	src/dsp/SilentAudioCurve.cpp \
-	src/dsp/ConstantAudioCurve.cpp \
-	src/dsp/PercussiveAudioCurve.cpp \
-	src/dsp/Resampler.cpp \
-	src/dsp/FFT.cpp \
-	src/system/Allocators.cpp \
-	src/system/sysutils.cpp \
-	src/system/Thread.cpp \
-	src/StretcherChannelData.cpp \
-	src/StretcherImpl.cpp
-
-LIBRARY_OBJECTS := $(LIBRARY_SOURCES:.cpp=.o)
-
-$(STATIC_TARGET):	$(LIBRARY_OBJECTS)
-	rm -f $@
-	$(AR) rsc $@ $^
-
-$(DYNAMIC_TARGET):	$(LIBRARY_OBJECTS)
-	$(CXX) $(DYNAMIC_LDFLAGS) $^ -o $@ $(LIBRARY_LIBS) $(LDFLAGS)
-
-bin:
-	$(MKDIR) $@
-lib:
-	$(MKDIR) $@
-
-install:	all
-	$(MKDIR) -p $(DESTDIR)$(INSTALL_BINDIR)
-	$(MKDIR) -p $(DESTDIR)$(INSTALL_INCDIR)
-	$(MKDIR) -p $(DESTDIR)$(INSTALL_LIBDIR)
-	$(MKDIR) -p $(DESTDIR)$(INSTALL_PKGDIR)
-	cp $(PUBLIC_INCLUDES) $(DESTDIR)$(INSTALL_INCDIR)
-	cp $(STATIC_TARGET) $(DESTDIR)$(INSTALL_LIBDIR)
-	rm -f $(DESTDIR)$(INSTALL_LIBDIR)/$(DYNAMIC_LIBNAME)$(DYNAMIC_ABI_VERSION)
-	rm -f $(DESTDIR)$(INSTALL_LIBDIR)/$(DYNAMIC_LIBNAME)
-	cp $(DYNAMIC_TARGET) $(DESTDIR)$(INSTALL_LIBDIR)/$(DYNAMIC_LIBNAME)$(DYNAMIC_FULL_VERSION)
-	test -n "$(DYNAMIC_FULL_VERSION)" && ln -s $(DYNAMIC_LIBNAME)$(DYNAMIC_FULL_VERSION) $(DESTDIR)$(INSTALL_LIBDIR)/$(DYNAMIC_LIBNAME)$(DYNAMIC_ABI_VERSION)
-	test -n "$(DYNAMIC_FULL_VERSION)" && ln -s $(DYNAMIC_LIBNAME)$(DYNAMIC_FULL_VERSION) $(DESTDIR)$(INSTALL_LIBDIR)/$(DYNAMIC_LIBNAME)
-	sed "s,%PREFIX%,/usr/local," rubberband.pc.in \
-	  > $(DESTDIR)$(INSTALL_PKGDIR)/rubberband.pc
-
-clean:
-	rm -f $(LIBRARY_OBJECTS)
-
-distclean:	clean
-	rm -f $(STATIC_TARGET) $(DYNAMIC_TARGET)
-
-depend:
-	makedepend -Y $(LIBRARY_SOURCES)
-
-
-# DO NOT DELETE
-
-src/rubberband-c.o: rubberband/rubberband-c.h
-src/rubberband-c.o: rubberband/RubberBandStretcher.h
-src/RubberBandStretcher.o: src/StretcherImpl.h
-src/RubberBandStretcher.o: rubberband/RubberBandStretcher.h src/dsp/Window.h src/dsp/SincWindow.h
-src/RubberBandStretcher.o: src/dsp/FFT.h src/base/RingBuffer.h
-src/RubberBandStretcher.o: src/base/Scavenger.h src/system/Thread.h
-src/RubberBandStretcher.o: src/system/Thread.h src/system/sysutils.h
-src/StretcherProcess.o: src/StretcherImpl.h rubberband/RubberBandStretcher.h
-src/StretcherProcess.o: src/dsp/Window.h src/dsp/SincWindow.h src/dsp/FFT.h src/base/RingBuffer.h
-src/StretcherProcess.o: src/base/Scavenger.h src/system/Thread.h
-src/StretcherProcess.o: src/system/Thread.h src/system/sysutils.h
-src/StretcherProcess.o: src/dsp/PercussiveAudioCurve.h
-src/StretcherProcess.o: src/dsp/AudioCurveCalculator.h
-src/StretcherProcess.o: src/dsp/HighFrequencyAudioCurve.h
-src/StretcherProcess.o: src/dsp/ConstantAudioCurve.h src/StretchCalculator.h
-src/StretcherProcess.o: src/StretcherChannelData.h src/dsp/Resampler.h
-src/StretcherProcess.o: src/base/Profiler.h src/system/VectorOps.h
-src/StretcherProcess.o: src/system/sysutils.h
-src/StretchCalculator.o: src/StretchCalculator.h src/system/sysutils.h
-src/system/Thread.o: src/system/Thread.h
-src/base/Profiler.o: src/base/Profiler.h src/system/sysutils.h
-src/dsp/AudioCurveCalculator.o: src/dsp/AudioCurveCalculator.h
-src/dsp/AudioCurveCalculator.o: src/system/sysutils.h
-src/dsp/SpectralDifferenceAudioCurve.o: src/dsp/SpectralDifferenceAudioCurve.h
-src/dsp/SpectralDifferenceAudioCurve.o: src/dsp/AudioCurveCalculator.h
-src/dsp/SpectralDifferenceAudioCurve.o: src/system/sysutils.h
-src/dsp/SpectralDifferenceAudioCurve.o: src/dsp/Window.h src/dsp/SincWindow.h
-src/dsp/SpectralDifferenceAudioCurve.o: src/system/VectorOps.h
-src/dsp/SpectralDifferenceAudioCurve.o: src/system/sysutils.h
-src/dsp/HighFrequencyAudioCurve.o: src/dsp/HighFrequencyAudioCurve.h
-src/dsp/HighFrequencyAudioCurve.o: src/dsp/AudioCurveCalculator.h
-src/dsp/HighFrequencyAudioCurve.o: src/system/sysutils.h
-src/dsp/SilentAudioCurve.o: src/dsp/SilentAudioCurve.h
-src/dsp/SilentAudioCurve.o: src/dsp/AudioCurveCalculator.h
-src/dsp/SilentAudioCurve.o: src/system/sysutils.h
-src/dsp/ConstantAudioCurve.o: src/dsp/ConstantAudioCurve.h
-src/dsp/ConstantAudioCurve.o: src/dsp/AudioCurveCalculator.h
-src/dsp/ConstantAudioCurve.o: src/system/sysutils.h
-src/dsp/PercussiveAudioCurve.o: src/dsp/PercussiveAudioCurve.h
-src/dsp/PercussiveAudioCurve.o: src/dsp/AudioCurveCalculator.h
-src/dsp/PercussiveAudioCurve.o: src/system/sysutils.h src/system/VectorOps.h
-src/dsp/PercussiveAudioCurve.o: src/system/sysutils.h
-src/dsp/Resampler.o: src/dsp/Resampler.h src/system/sysutils.h
-src/dsp/Resampler.o: src/base/Profiler.h
-src/dsp/FFT.o: src/dsp/FFT.h src/system/sysutils.h src/system/Thread.h
-src/dsp/FFT.o: src/base/Profiler.h src/system/VectorOps.h
-src/dsp/FFT.o: src/system/sysutils.h
-src/system/Allocators.o: src/system/Allocators.h src/system/VectorOps.h
-src/system/Allocators.o: src/system/sysutils.h
-src/system/sysutils.o: src/system/sysutils.h
-src/StretcherChannelData.o: src/StretcherChannelData.h src/StretcherImpl.h
-src/StretcherChannelData.o: rubberband/RubberBandStretcher.h src/dsp/Window.h src/dsp/SincWindow.h
-src/StretcherChannelData.o: src/dsp/FFT.h src/base/RingBuffer.h
-src/StretcherChannelData.o: src/base/Scavenger.h src/system/Thread.h
-src/StretcherChannelData.o: src/system/Thread.h src/system/sysutils.h
-src/StretcherChannelData.o: src/dsp/Resampler.h src/system/Allocators.h
-src/StretcherChannelData.o: src/system/VectorOps.h src/system/sysutils.h
-src/StretcherImpl.o: src/StretcherImpl.h rubberband/RubberBandStretcher.h
-src/StretcherImpl.o: src/dsp/Window.h src/dsp/SincWindow.h src/dsp/FFT.h src/base/RingBuffer.h
-src/StretcherImpl.o: src/base/Scavenger.h src/system/Thread.h src/system/Thread.h
-src/StretcherImpl.o: src/system/sysutils.h src/dsp/PercussiveAudioCurve.h
-src/StretcherImpl.o: src/dsp/AudioCurveCalculator.h
-src/StretcherImpl.o: src/dsp/HighFrequencyAudioCurve.h
-src/StretcherImpl.o: src/dsp/SpectralDifferenceAudioCurve.h src/dsp/Window.h src/dsp/SincWindow.h
-src/StretcherImpl.o: src/system/VectorOps.h src/system/sysutils.h
-src/StretcherImpl.o: src/dsp/SilentAudioCurve.h src/dsp/ConstantAudioCurve.h
-src/StretcherImpl.o: src/dsp/Resampler.h src/StretchCalculator.h
-src/StretcherImpl.o: src/StretcherChannelData.h src/base/Profiler.h
-main/main.o: rubberband/RubberBandStretcher.h src/system/sysutils.h
-main/main.o: src/base/Profiler.h
diff --git a/com/breakfastquay/rubberband/RubberBandStretcher.java b/com/breakfastquay/rubberband/RubberBandStretcher.java
new file mode 100644
index 0000000..deaab29
--- /dev/null
+++ b/com/breakfastquay/rubberband/RubberBandStretcher.java
@@ -0,0 +1,96 @@
+/* Copyright Chris Cannam - All Rights Reserved */
+
+package com.breakfastquay.rubberband;
+
+public class RubberBandStretcher
+{
+    public RubberBandStretcher(int sampleRate, int channels,
+			       int options,
+			       double initialTimeRatio,
+			       double initialPitchScale) {
+	handle = 0;
+	initialise(sampleRate, channels, options,
+		   initialTimeRatio, initialPitchScale);
+    }
+
+    public native void dispose();
+
+    public native void reset();
+
+    public native void setTimeRatio(double ratio);
+    public native void setPitchScale(double scale);
+
+    public native int getChannelCount();
+    public native double getTimeRatio();
+    public native double getPitchScale();
+
+    public native int getLatency();
+
+    public native void setTransientsOption(int options);
+    public native void setDetectorOption(int options);
+    public native void setPhaseOption(int options);
+    public native void setFormantOption(int options);
+    public native void setPitchOption(int options);
+
+    public native void setExpectedInputDuration(long samples);
+    public native void setMaxProcessSize(int samples);
+
+    public native int getSamplesRequired();
+
+    //!!! setKeyFrameMap
+
+    //!!! we should check, for example, that the samples arrays have the right number of channels
+    //!!! extracting subset of array in java is a pain, this should take array + offset and there should be an interleaved alternative
+    public native void study(float[][] input, boolean finalBlock);
+    public native void process(float[][] input, boolean finalBlock);
+
+    public native int available();
+    public native int retrieve(float[][] output);
+
+    private native void initialise(int sampleRate, int channels, int options,
+				   double initialTimeRatio,
+				   double initialPitchScale);
+    private long handle;
+
+    public static final int OptionProcessOffline       = 0x00000000;
+    public static final int OptionProcessRealTime      = 0x00000001;
+
+    public static final int OptionStretchElastic       = 0x00000000;
+    public static final int OptionStretchPrecise       = 0x00000010;
+    
+    public static final int OptionTransientsCrisp      = 0x00000000;
+    public static final int OptionTransientsMixed      = 0x00000100;
+    public static final int OptionTransientsSmooth     = 0x00000200;
+
+    public static final int OptionDetectorCompound     = 0x00000000;
+    public static final int OptionDetectorPercussive   = 0x00000400;
+    public static final int OptionDetectorSoft         = 0x00000800;
+
+    public static final int OptionPhaseLaminar         = 0x00000000;
+    public static final int OptionPhaseIndependent     = 0x00002000;
+
+    public static final int OptionWindowStandard       = 0x00000000;
+    public static final int OptionWindowShort          = 0x00100000;
+    public static final int OptionWindowLong           = 0x00200000;
+
+    public static final int OptionSmoothingOff         = 0x00000000;
+    public static final int OptionSmoothingOn          = 0x00800000;
+
+    public static final int OptionFormantShifted       = 0x00000000;
+    public static final int OptionFormantPreserved     = 0x01000000;
+
+    public static final int OptionPitchHighSpeed       = 0x00000000;
+    public static final int OptionPitchHighQuality     = 0x02000000;
+    public static final int OptionPitchHighConsistency = 0x04000000;
+
+    public static final int OptionChannelsApart        = 0x00000000;
+    public static final int OptionChannelsTogether     = 0x10000000;
+
+    public static final int DefaultOptions             = 0x00000000;
+    public static final int PercussiveOptions          = 0x00102000;
+
+    static {
+	System.loadLibrary("rubberband");
+    }
+};
+
diff --git a/configure b/configure
index 1959c14..737a182 100755
--- a/configure
+++ b/configure
@@ -1,13 +1,11 @@
 #! /bin/sh
 # Guess values for system-dependent variables and create Makefiles.
-# Generated by GNU Autoconf 2.68 for RubberBand 1.6.
+# Generated by GNU Autoconf 2.69 for RubberBand 1.7.
 #
-# Report bugs to <cannam@all-day-breakfast.com>.
+# Report bugs to <chris.cannam@breakfastquay.com>.
 #
 #
-# Copyright (C) 1992, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2001,
-# 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 Free Software
-# Foundation, Inc.
+# Copyright (C) 1992-1996, 1998-2012 Free Software Foundation, Inc.
 #
 #
 # This configure script is free software; the Free Software Foundation
@@ -136,6 +134,31 @@ export LANGUAGE
 # CDPATH.
 (unset CDPATH) >/dev/null 2>&1 && unset CDPATH
 
+# Use a proper internal environment variable to ensure we don't fall
+  # into an infinite loop, continuously re-executing ourselves.
+  if test x"${_as_can_reexec}" != xno && test "x$CONFIG_SHELL" != x; then
+    _as_can_reexec=no; export _as_can_reexec;
+    # We cannot yet assume a decent shell, so we have to provide a
+# neutralization value for shells without unset; and this also
+# works around shells that cannot unset nonexistent variables.
+# Preserve -v and -x to the replacement shell.
+BASH_ENV=/dev/null
+ENV=/dev/null
+(unset BASH_ENV) >/dev/null 2>&1 && unset BASH_ENV ENV
+case $- in # ((((
+  *v*x* | *x*v* ) as_opts=-vx ;;
+  *v* ) as_opts=-v ;;
+  *x* ) as_opts=-x ;;
+  * ) as_opts= ;;
+esac
+exec $CONFIG_SHELL $as_opts "$as_myself" ${1+"$@"}
+# Admittedly, this is quite paranoid, since all the known shells bail
+# out after a failed `exec'.
+$as_echo "$0: could not re-execute with $CONFIG_SHELL" >&2
+as_fn_exit 255
+  fi
+  # We don't want this to propagate to other subprocesses.
+          { _as_can_reexec=; unset _as_can_reexec;}
 if test "x$CONFIG_SHELL" = x; then
   as_bourne_compatible="if test -n \"\${ZSH_VERSION+set}\" && (emulate sh) >/dev/null 2>&1; then :
   emulate sh
@@ -169,7 +192,8 @@ if ( set x; as_fn_ret_success y && test x = \"\$1\" ); then :
 else
   exitcode=1; echo positional parameters were not saved.
 fi
-test x\$exitcode = x0 || exit 1"
+test x\$exitcode = x0 || exit 1
+test -x / || exit 1"
   as_suggested="  as_lineno_1=";as_suggested=$as_suggested$LINENO;as_suggested=$as_suggested" as_lineno_1a=\$LINENO
   as_lineno_2=";as_suggested=$as_suggested$LINENO;as_suggested=$as_suggested" as_lineno_2a=\$LINENO
   eval 'test \"x\$as_lineno_1'\$as_run'\" != \"x\$as_lineno_2'\$as_run'\" &&
@@ -214,21 +238,25 @@ IFS=$as_save_IFS
 
 
       if test "x$CONFIG_SHELL" != x; then :
-  # We cannot yet assume a decent shell, so we have to provide a
-	# neutralization value for shells without unset; and this also
-	# works around shells that cannot unset nonexistent variables.
-	# Preserve -v and -x to the replacement shell.
-	BASH_ENV=/dev/null
-	ENV=/dev/null
-	(unset BASH_ENV) >/dev/null 2>&1 && unset BASH_ENV ENV
-	export CONFIG_SHELL
-	case $- in # ((((
-	  *v*x* | *x*v* ) as_opts=-vx ;;
-	  *v* ) as_opts=-v ;;
-	  *x* ) as_opts=-x ;;
-	  * ) as_opts= ;;
-	esac
-	exec "$CONFIG_SHELL" $as_opts "$as_myself" ${1+"$@"}
+  export CONFIG_SHELL
+             # We cannot yet assume a decent shell, so we have to provide a
+# neutralization value for shells without unset; and this also
+# works around shells that cannot unset nonexistent variables.
+# Preserve -v and -x to the replacement shell.
+BASH_ENV=/dev/null
+ENV=/dev/null
+(unset BASH_ENV) >/dev/null 2>&1 && unset BASH_ENV ENV
+case $- in # ((((
+  *v*x* | *x*v* ) as_opts=-vx ;;
+  *v* ) as_opts=-v ;;
+  *x* ) as_opts=-x ;;
+  * ) as_opts= ;;
+esac
+exec $CONFIG_SHELL $as_opts "$as_myself" ${1+"$@"}
+# Admittedly, this is quite paranoid, since all the known shells bail
+# out after a failed `exec'.
+$as_echo "$0: could not re-execute with $CONFIG_SHELL" >&2
+exit 255
 fi
 
     if test x$as_have_required = xno; then :
@@ -239,7 +267,7 @@ fi
     $as_echo "$0: be upgraded to zsh 4.3.4 or later."
   else
     $as_echo "$0: Please tell bug-autoconf@gnu.org and
-$0: cannam@all-day-breakfast.com about your system,
+$0: chris.cannam@breakfastquay.com about your system,
 $0: including any error possibly output before this
 $0: message. Then install a modern shell, or manually run
 $0: the script under such a shell if you do have one."
@@ -331,6 +359,14 @@ $as_echo X"$as_dir" |
 
 
 } # as_fn_mkdir_p
+
+# as_fn_executable_p FILE
+# -----------------------
+# Test if FILE is an executable regular file.
+as_fn_executable_p ()
+{
+  test -f "$1" && test -x "$1"
+} # as_fn_executable_p
 # as_fn_append VAR VALUE
 # ----------------------
 # Append the text in VALUE to the end of the definition contained in VAR. Take
@@ -452,6 +488,10 @@ as_cr_alnum=$as_cr_Letters$as_cr_digits
   chmod +x "$as_me.lineno" ||
     { $as_echo "$as_me: error: cannot create $as_me.lineno; rerun with a POSIX shell" >&2; as_fn_exit 1; }
 
+  # If we had to re-execute with $CONFIG_SHELL, we're ensured to have
+  # already done that, so ensure we don't try to do so again and fall
+  # in an infinite loop.  This has already happened in practice.
+  _as_can_reexec=no; export _as_can_reexec
   # Don't try to exec as it changes $[0], causing all sort of problems
   # (the dirname of $[0] is not the place where we might find the
   # original and so on.  Autoconf is especially sensitive to this).
@@ -486,16 +526,16 @@ if (echo >conf$$.file) 2>/dev/null; then
     # ... but there are two gotchas:
     # 1) On MSYS, both `ln -s file dir' and `ln file dir' fail.
     # 2) DJGPP < 2.04 has no symlinks; `ln -s' creates a wrapper executable.
-    # In both cases, we have to default to `cp -p'.
+    # In both cases, we have to default to `cp -pR'.
     ln -s conf$$.file conf$$.dir 2>/dev/null && test ! -f conf$$.exe ||
-      as_ln_s='cp -p'
+      as_ln_s='cp -pR'
   elif ln conf$$.file conf$$ 2>/dev/null; then
     as_ln_s=ln
   else
-    as_ln_s='cp -p'
+    as_ln_s='cp -pR'
   fi
 else
-  as_ln_s='cp -p'
+  as_ln_s='cp -pR'
 fi
 rm -f conf$$ conf$$.exe conf$$.dir/conf$$.file conf$$.file
 rmdir conf$$.dir 2>/dev/null
@@ -507,28 +547,8 @@ else
   as_mkdir_p=false
 fi
 
-if test -x / >/dev/null 2>&1; then
-  as_test_x='test -x'
-else
-  if ls -dL / >/dev/null 2>&1; then
-    as_ls_L_option=L
-  else
-    as_ls_L_option=
-  fi
-  as_test_x='
-    eval sh -c '\''
-      if test -d "$1"; then
-	test -d "$1/.";
-      else
-	case $1 in #(
-	-*)set "./$1";;
-	esac;
-	case `ls -ld'$as_ls_L_option' "$1" 2>/dev/null` in #((
-	???[sx]*):;;*)false;;esac;fi
-    '\'' sh
-  '
-fi
-as_executable_p=$as_test_x
+as_test_x='test -x'
+as_executable_p=as_fn_executable_p
 
 # Sed expression to map a string onto a valid CPP name.
 as_tr_cpp="eval sed 'y%*$as_cr_letters%P$as_cr_LETTERS%;s%[^_$as_cr_alnum]%_%g'"
@@ -560,9 +580,9 @@ MAKEFLAGS=
 # Identity of this package.
 PACKAGE_NAME='RubberBand'
 PACKAGE_TARNAME='rubberband'
-PACKAGE_VERSION='1.6'
-PACKAGE_STRING='RubberBand 1.6'
-PACKAGE_BUGREPORT='cannam@all-day-breakfast.com'
+PACKAGE_VERSION='1.7'
+PACKAGE_STRING='RubberBand 1.7'
+PACKAGE_BUGREPORT='chris.cannam@breakfastquay.com'
 PACKAGE_URL=''
 
 ac_unique_file="src/StretcherImpl.h"
@@ -1148,8 +1168,6 @@ target=$target_alias
 if test "x$host_alias" != x; then
   if test "x$build_alias" = x; then
     cross_compiling=maybe
-    $as_echo "$as_me: WARNING: if you wanted to set the --build type, don't use --host.
-    If a cross compiler is detected then cross compile mode will be used" >&2
   elif test "x$build_alias" != "x$host_alias"; then
     cross_compiling=yes
   fi
@@ -1235,7 +1253,7 @@ if test "$ac_init_help" = "long"; then
   # Omit some internal or obsolete options to make the list less imposing.
   # This message is too long to be a string in the A/UX 3.1 sh.
   cat <<_ACEOF
-\`configure' configures RubberBand 1.6 to adapt to many kinds of systems.
+\`configure' configures RubberBand 1.7 to adapt to many kinds of systems.
 
 Usage: $0 [OPTION]... [VAR=VALUE]...
 
@@ -1296,7 +1314,7 @@ fi
 
 if test -n "$ac_init_help"; then
   case $ac_init_help in
-     short | recursive ) echo "Configuration of RubberBand 1.6:";;
+     short | recursive ) echo "Configuration of RubberBand 1.7:";;
    esac
   cat <<\_ACEOF
 
@@ -1330,7 +1348,7 @@ Some influential environment variables:
 Use these variables to override the choices made by `configure' or to help
 it to find libraries and programs with nonstandard names/locations.
 
-Report bugs to <cannam@all-day-breakfast.com>.
+Report bugs to <chris.cannam@breakfastquay.com>.
 _ACEOF
 ac_status=$?
 fi
@@ -1393,10 +1411,10 @@ fi
 test -n "$ac_init_help" && exit $ac_status
 if $ac_init_version; then
   cat <<\_ACEOF
-RubberBand configure 1.6
-generated by GNU Autoconf 2.68
+RubberBand configure 1.7
+generated by GNU Autoconf 2.69
 
-Copyright (C) 2010 Free Software Foundation, Inc.
+Copyright (C) 2012 Free Software Foundation, Inc.
 This configure script is free software; the Free Software Foundation
 gives unlimited permission to copy, distribute and modify it.
 _ACEOF
@@ -1663,9 +1681,9 @@ $as_echo "$as_me: WARNING: $2: see the Autoconf documentation" >&2;}
 $as_echo "$as_me: WARNING: $2:     section \"Present But Cannot Be Compiled\"" >&2;}
     { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $2: proceeding with the compiler's result" >&5
 $as_echo "$as_me: WARNING: $2: proceeding with the compiler's result" >&2;}
-( $as_echo "## ------------------------------------------- ##
-## Report this to cannam@all-day-breakfast.com ##
-## ------------------------------------------- ##"
+( $as_echo "## --------------------------------------------- ##
+## Report this to chris.cannam@breakfastquay.com ##
+## --------------------------------------------- ##"
      ) | sed "s/^/$as_me: WARNING:     /" >&2
     ;;
 esac
@@ -1687,8 +1705,8 @@ cat >config.log <<_ACEOF
 This file contains any messages produced by compilers while
 running configure, to aid debugging if configure makes a mistake.
 
-It was created by RubberBand $as_me 1.6, which was
-generated by GNU Autoconf 2.68.  Invocation command line was
+It was created by RubberBand $as_me 1.7, which was
+generated by GNU Autoconf 2.69.  Invocation command line was
 
   $ $0 $@
 
@@ -2065,7 +2083,7 @@ do
   IFS=$as_save_IFS
   test -z "$as_dir" && as_dir=.
     for ac_exec_ext in '' $ac_executable_extensions; do
-  if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
     ac_cv_prog_CXX="$ac_tool_prefix$ac_prog"
     $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
     break 2
@@ -2109,7 +2127,7 @@ do
   IFS=$as_save_IFS
   test -z "$as_dir" && as_dir=.
     for ac_exec_ext in '' $ac_executable_extensions; do
-  if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
     ac_cv_prog_ac_ct_CXX="$ac_prog"
     $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
     break 2
@@ -2566,7 +2584,7 @@ do
   IFS=$as_save_IFS
   test -z "$as_dir" && as_dir=.
     for ac_exec_ext in '' $ac_executable_extensions; do
-  if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
     ac_cv_prog_CC="${ac_tool_prefix}gcc"
     $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
     break 2
@@ -2606,7 +2624,7 @@ do
   IFS=$as_save_IFS
   test -z "$as_dir" && as_dir=.
     for ac_exec_ext in '' $ac_executable_extensions; do
-  if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
     ac_cv_prog_ac_ct_CC="gcc"
     $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
     break 2
@@ -2659,7 +2677,7 @@ do
   IFS=$as_save_IFS
   test -z "$as_dir" && as_dir=.
     for ac_exec_ext in '' $ac_executable_extensions; do
-  if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
     ac_cv_prog_CC="${ac_tool_prefix}cc"
     $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
     break 2
@@ -2700,7 +2718,7 @@ do
   IFS=$as_save_IFS
   test -z "$as_dir" && as_dir=.
     for ac_exec_ext in '' $ac_executable_extensions; do
-  if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
     if test "$as_dir/$ac_word$ac_exec_ext" = "/usr/ucb/cc"; then
        ac_prog_rejected=yes
        continue
@@ -2758,7 +2776,7 @@ do
   IFS=$as_save_IFS
   test -z "$as_dir" && as_dir=.
     for ac_exec_ext in '' $ac_executable_extensions; do
-  if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
     ac_cv_prog_CC="$ac_tool_prefix$ac_prog"
     $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
     break 2
@@ -2802,7 +2820,7 @@ do
   IFS=$as_save_IFS
   test -z "$as_dir" && as_dir=.
     for ac_exec_ext in '' $ac_executable_extensions; do
-  if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
     ac_cv_prog_ac_ct_CC="$ac_prog"
     $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
     break 2
@@ -2998,8 +3016,7 @@ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
 #include <stdarg.h>
 #include <stdio.h>
-#include <sys/types.h>
-#include <sys/stat.h>
+struct stat;
 /* Most of the following tests are stolen from RCS 5.7's src/conf.sh.  */
 struct buf { int x; };
 FILE * (*rcsopen) (struct buf *, struct stat *, int);
@@ -3239,7 +3256,7 @@ do
     for ac_prog in grep ggrep; do
     for ac_exec_ext in '' $ac_executable_extensions; do
       ac_path_GREP="$as_dir/$ac_prog$ac_exec_ext"
-      { test -f "$ac_path_GREP" && $as_test_x "$ac_path_GREP"; } || continue
+      as_fn_executable_p "$ac_path_GREP" || continue
 # Check for GNU ac_path_GREP and select it if it is found.
   # Check for GNU $ac_path_GREP
 case `"$ac_path_GREP" --version 2>&1` in
@@ -3305,7 +3322,7 @@ do
     for ac_prog in egrep; do
     for ac_exec_ext in '' $ac_executable_extensions; do
       ac_path_EGREP="$as_dir/$ac_prog$ac_exec_ext"
-      { test -f "$ac_path_EGREP" && $as_test_x "$ac_path_EGREP"; } || continue
+      as_fn_executable_p "$ac_path_EGREP" || continue
 # Check for GNU ac_path_EGREP and select it if it is found.
   # Check for GNU $ac_path_EGREP
 case `"$ac_path_EGREP" --version 2>&1` in
@@ -3713,6 +3730,7 @@ $as_echo "#define AC_APPLE_UNIVERSAL_BUILD 1" >>confdefs.h
 
 
 
+
 if test "x$ac_cv_env_PKG_CONFIG_set" != "xset"; then
 	if test -n "$ac_tool_prefix"; then
   # Extract the first word of "${ac_tool_prefix}pkg-config", so it can be a program name with args.
@@ -3733,7 +3751,7 @@ do
   IFS=$as_save_IFS
   test -z "$as_dir" && as_dir=.
     for ac_exec_ext in '' $ac_executable_extensions; do
-  if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
     ac_cv_path_PKG_CONFIG="$as_dir/$ac_word$ac_exec_ext"
     $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
     break 2
@@ -3776,7 +3794,7 @@ do
   IFS=$as_save_IFS
   test -z "$as_dir" && as_dir=.
     for ac_exec_ext in '' $ac_executable_extensions; do
-  if { test -f "$as_dir/$ac_word$ac_exec_ext" && $as_test_x "$as_dir/$ac_word$ac_exec_ext"; }; then
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
     ac_cv_path_ac_pt_PKG_CONFIG="$as_dir/$ac_word$ac_exec_ext"
     $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
     break 2
@@ -3841,6 +3859,7 @@ if test -n "$SRC_CFLAGS"; then
   $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
   test $ac_status = 0; }; then
   pkg_cv_SRC_CFLAGS=`$PKG_CONFIG --cflags "samplerate" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
 else
   pkg_failed=yes
 fi
@@ -3857,6 +3876,7 @@ if test -n "$SRC_LIBS"; then
   $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
   test $ac_status = 0; }; then
   pkg_cv_SRC_LIBS=`$PKG_CONFIG --libs "samplerate" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
 else
   pkg_failed=yes
 fi
@@ -3876,9 +3896,9 @@ else
         _pkg_short_errors_supported=no
 fi
         if test $_pkg_short_errors_supported = yes; then
-	        SRC_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors "samplerate" 2>&1`
+	        SRC_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "samplerate" 2>&1`
         else
-	        SRC_PKG_ERRORS=`$PKG_CONFIG --print-errors "samplerate" 2>&1`
+	        SRC_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "samplerate" 2>&1`
         fi
 	# Put the nasty error message in config.log where it belongs
 	echo "$SRC_PKG_ERRORS" >&5
@@ -3933,6 +3953,7 @@ if test -n "$SNDFILE_CFLAGS"; then
   $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
   test $ac_status = 0; }; then
   pkg_cv_SNDFILE_CFLAGS=`$PKG_CONFIG --cflags "sndfile" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
 else
   pkg_failed=yes
 fi
@@ -3949,6 +3970,7 @@ if test -n "$SNDFILE_LIBS"; then
   $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
   test $ac_status = 0; }; then
   pkg_cv_SNDFILE_LIBS=`$PKG_CONFIG --libs "sndfile" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
 else
   pkg_failed=yes
 fi
@@ -3968,9 +3990,9 @@ else
         _pkg_short_errors_supported=no
 fi
         if test $_pkg_short_errors_supported = yes; then
-	        SNDFILE_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors "sndfile" 2>&1`
+	        SNDFILE_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "sndfile" 2>&1`
         else
-	        SNDFILE_PKG_ERRORS=`$PKG_CONFIG --print-errors "sndfile" 2>&1`
+	        SNDFILE_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "sndfile" 2>&1`
         fi
 	# Put the nasty error message in config.log where it belongs
 	echo "$SNDFILE_PKG_ERRORS" >&5
@@ -4025,6 +4047,7 @@ if test -n "$FFTW_CFLAGS"; then
   $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
   test $ac_status = 0; }; then
   pkg_cv_FFTW_CFLAGS=`$PKG_CONFIG --cflags "fftw3" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
 else
   pkg_failed=yes
 fi
@@ -4041,6 +4064,7 @@ if test -n "$FFTW_LIBS"; then
   $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
   test $ac_status = 0; }; then
   pkg_cv_FFTW_LIBS=`$PKG_CONFIG --libs "fftw3" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
 else
   pkg_failed=yes
 fi
@@ -4060,9 +4084,9 @@ else
         _pkg_short_errors_supported=no
 fi
         if test $_pkg_short_errors_supported = yes; then
-	        FFTW_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors "fftw3" 2>&1`
+	        FFTW_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "fftw3" 2>&1`
         else
-	        FFTW_PKG_ERRORS=`$PKG_CONFIG --print-errors "fftw3" 2>&1`
+	        FFTW_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "fftw3" 2>&1`
         fi
 	# Put the nasty error message in config.log where it belongs
 	echo "$FFTW_PKG_ERRORS" >&5
@@ -4142,6 +4166,7 @@ if test -n "$Vamp_CFLAGS"; then
   $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
   test $ac_status = 0; }; then
   pkg_cv_Vamp_CFLAGS=`$PKG_CONFIG --cflags "vamp-sdk" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
 else
   pkg_failed=yes
 fi
@@ -4158,6 +4183,7 @@ if test -n "$Vamp_LIBS"; then
   $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
   test $ac_status = 0; }; then
   pkg_cv_Vamp_LIBS=`$PKG_CONFIG --libs "vamp-sdk" 2>/dev/null`
+		      test "x$?" != "x0" && pkg_failed=yes
 else
   pkg_failed=yes
 fi
@@ -4177,9 +4203,9 @@ else
         _pkg_short_errors_supported=no
 fi
         if test $_pkg_short_errors_supported = yes; then
-	        Vamp_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors "vamp-sdk" 2>&1`
+	        Vamp_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "vamp-sdk" 2>&1`
         else
-	        Vamp_PKG_ERRORS=`$PKG_CONFIG --print-errors "vamp-sdk" 2>&1`
+	        Vamp_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "vamp-sdk" 2>&1`
         fi
 	# Put the nasty error message in config.log where it belongs
 	echo "$Vamp_PKG_ERRORS" >&5
@@ -4226,7 +4252,7 @@ if test "x$GCC" = "xyes"; then
   esac
   case " $CXXFLAGS " in
     *[\ \	]-fPIC\ -Wall[\ \	]*) ;;
-    *) CXXFLAGS="$CXXFLAGS -fPIC -Wall -Woverloaded-virtual" ;;
+    *) CXXFLAGS="$CXXFLAGS -fPIC -Wall" ;;
   esac
 fi
 
@@ -4676,16 +4702,16 @@ if (echo >conf$$.file) 2>/dev/null; then
     # ... but there are two gotchas:
     # 1) On MSYS, both `ln -s file dir' and `ln file dir' fail.
     # 2) DJGPP < 2.04 has no symlinks; `ln -s' creates a wrapper executable.
-    # In both cases, we have to default to `cp -p'.
+    # In both cases, we have to default to `cp -pR'.
     ln -s conf$$.file conf$$.dir 2>/dev/null && test ! -f conf$$.exe ||
-      as_ln_s='cp -p'
+      as_ln_s='cp -pR'
   elif ln conf$$.file conf$$ 2>/dev/null; then
     as_ln_s=ln
   else
-    as_ln_s='cp -p'
+    as_ln_s='cp -pR'
   fi
 else
-  as_ln_s='cp -p'
+  as_ln_s='cp -pR'
 fi
 rm -f conf$$ conf$$.exe conf$$.dir/conf$$.file conf$$.file
 rmdir conf$$.dir 2>/dev/null
@@ -4745,28 +4771,16 @@ else
   as_mkdir_p=false
 fi
 
-if test -x / >/dev/null 2>&1; then
-  as_test_x='test -x'
-else
-  if ls -dL / >/dev/null 2>&1; then
-    as_ls_L_option=L
-  else
-    as_ls_L_option=
-  fi
-  as_test_x='
-    eval sh -c '\''
-      if test -d "$1"; then
-	test -d "$1/.";
-      else
-	case $1 in #(
-	-*)set "./$1";;
-	esac;
-	case `ls -ld'$as_ls_L_option' "$1" 2>/dev/null` in #((
-	???[sx]*):;;*)false;;esac;fi
-    '\'' sh
-  '
-fi
-as_executable_p=$as_test_x
+
+# as_fn_executable_p FILE
+# -----------------------
+# Test if FILE is an executable regular file.
+as_fn_executable_p ()
+{
+  test -f "$1" && test -x "$1"
+} # as_fn_executable_p
+as_test_x='test -x'
+as_executable_p=as_fn_executable_p
 
 # Sed expression to map a string onto a valid CPP name.
 as_tr_cpp="eval sed 'y%*$as_cr_letters%P$as_cr_LETTERS%;s%[^_$as_cr_alnum]%_%g'"
@@ -4787,8 +4801,8 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
 # report actual input values of CONFIG_FILES etc. instead of their
 # values after options handling.
 ac_log="
-This file was extended by RubberBand $as_me 1.6, which was
-generated by GNU Autoconf 2.68.  Invocation command line was
+This file was extended by RubberBand $as_me 1.7, which was
+generated by GNU Autoconf 2.69.  Invocation command line was
 
   CONFIG_FILES    = $CONFIG_FILES
   CONFIG_HEADERS  = $CONFIG_HEADERS
@@ -4834,17 +4848,17 @@ Usage: $0 [OPTION]... [TAG]...
 Configuration files:
 $config_files
 
-Report bugs to <cannam@all-day-breakfast.com>."
+Report bugs to <chris.cannam@breakfastquay.com>."
 
 _ACEOF
 cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
 ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
 ac_cs_version="\\
-RubberBand config.status 1.6
-configured by $0, generated by GNU Autoconf 2.68,
+RubberBand config.status 1.7
+configured by $0, generated by GNU Autoconf 2.69,
   with options \\"\$ac_cs_config\\"
 
-Copyright (C) 2010 Free Software Foundation, Inc.
+Copyright (C) 2012 Free Software Foundation, Inc.
 This config.status script is free software; the Free Software Foundation
 gives unlimited permission to copy, distribute and modify it."
 
@@ -4921,7 +4935,7 @@ fi
 _ACEOF
 cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
 if \$ac_cs_recheck; then
-  set X '$SHELL' '$0' $ac_configure_args \$ac_configure_extra_args --no-create --no-recursion
+  set X $SHELL '$0' $ac_configure_args \$ac_configure_extra_args --no-create --no-recursion
   shift
   \$as_echo "running CONFIG_SHELL=$SHELL \$*" >&6
   CONFIG_SHELL='$SHELL'
diff --git a/configure.ac b/configure.ac
index 1ff8688..675e9fa 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1,5 +1,5 @@
 
-AC_INIT(RubberBand, 1.7, cannam@all-day-breakfast.com)
+AC_INIT(RubberBand, 1.7, chris.cannam@breakfastquay.com)
 
 AC_CONFIG_SRCDIR(src/StretcherImpl.h)
 AC_PROG_CXX
@@ -33,7 +33,7 @@ if test "x$GCC" = "xyes"; then
   esac
   case " $CXXFLAGS " in
     *[\ \	]-fPIC\ -Wall[\ \	]*) ;;
-    *) CXXFLAGS="$CXXFLAGS -fPIC -Wall -Woverloaded-virtual" ;;
+    *) CXXFLAGS="$CXXFLAGS -fPIC -Wall" ;;
   esac
 fi
 changequote([,])dnl
diff --git a/ladspa/RubberBandPitchShifter.cpp b/ladspa/RubberBandPitchShifter.cpp
index a4859b1..6a4a17f 100644
--- a/ladspa/RubberBandPitchShifter.cpp
+++ b/ladspa/RubberBandPitchShifter.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "RubberBandPitchShifter.h"
diff --git a/ladspa/RubberBandPitchShifter.h b/ladspa/RubberBandPitchShifter.h
index 22fb7dd..de28d59 100644
--- a/ladspa/RubberBandPitchShifter.h
+++ b/ladspa/RubberBandPitchShifter.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_PITCH_SHIFTER_H_
diff --git a/ladspa/libmain.cpp b/ladspa/libmain.cpp
index 5a672ce..3ce8010 100644
--- a/ladspa/libmain.cpp
+++ b/ladspa/libmain.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "RubberBandPitchShifter.h"
diff --git a/main/main.cpp b/main/main.cpp
index c977423..d51cff3 100644
--- a/main/main.cpp
+++ b/main/main.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "rubberband/RubberBandStretcher.h"
@@ -26,9 +35,13 @@
 
 #include "system/sysutils.h"
 
+#ifdef __MSVC__
+#include "getopt/getopt.h"
+#else
 #include <getopt.h>
 #include <unistd.h>
 #include <sys/time.h>
+#endif
 
 #include "base/Profiler.h"
 
@@ -39,6 +52,9 @@ using namespace RubberBand;
 using RubberBand::gettimeofday;
 #endif
 
+#ifdef __MSVC__
+using RubberBand::usleep;
+#endif
 
 double tempo_convert(const char *str)
 {
@@ -183,7 +199,7 @@ int main(int argc, char **argv)
         cerr << endl;
 	cerr << "Rubber Band" << endl;
         cerr << "An audio time-stretching and pitch-shifting library and utility program." << endl;
-	cerr << "Copyright 2011 Chris Cannam.  Distributed under the GNU General Public License." << endl;
+	cerr << "Copyright 2007-2012 Particular Programs Ltd." << endl;
         cerr << endl;
 	cerr << "   Usage: " << argv[0] << " [options] <infile.wav> <outfile.wav>" << endl;
         cerr << endl;
diff --git a/rubberband-library.vcproj b/rubberband-library.vcproj
new file mode 100644
index 0000000..205d1e1
--- /dev/null
+++ b/rubberband-library.vcproj
@@ -0,0 +1,367 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<VisualStudioProject
+	ProjectType="Visual C++"
+	Version="9.00"
+	Name="rubberband-library"
+	ProjectGUID="{020CEB11-EF4E-400E-971D-A35DB69D7CF9}"
+	RootNamespace="rubberband-library"
+	Keyword="Win32Proj"
+	TargetFrameworkVersion="0"
+	>
+	<Platforms>
+		<Platform
+			Name="Win32"
+		/>
+	</Platforms>
+	<ToolFiles>
+	</ToolFiles>
+	<Configurations>
+		<Configuration
+			Name="Debug|Win32"
+			OutputDirectory="Debug"
+			IntermediateDirectory="Debug"
+			ConfigurationType="4"
+			EnableManagedIncrementalBuild="1"
+			>
+			<Tool
+				Name="VCPreBuildEventTool"
+			/>
+			<Tool
+				Name="VCCustomBuildTool"
+			/>
+			<Tool
+				Name="VCXMLDataGeneratorTool"
+			/>
+			<Tool
+				Name="VCWebServiceProxyGeneratorTool"
+			/>
+			<Tool
+				Name="VCMIDLTool"
+			/>
+			<Tool
+				Name="VCCLCompilerTool"
+				Optimization="0"
+				AdditionalIncludeDirectories=".;.\src;"
+				PreprocessorDefinitions="__MSVC__;WIN32;_DEBUG;_LIB;NOMINMAX;_USE_MATH_DEFINES;USE_KISSFFT;USE_SPEEX"
+				MinimalRebuild="true"
+				BasicRuntimeChecks="3"
+				RuntimeLibrary="3"
+				UsePrecompiledHeader="0"
+				WarningLevel="2"
+				Detect64BitPortabilityProblems="false"
+				DebugInformationFormat="4"
+				ShowIncludes="false"
+			/>
+			<Tool
+				Name="VCManagedResourceCompilerTool"
+			/>
+			<Tool
+				Name="VCResourceCompilerTool"
+			/>
+			<Tool
+				Name="VCPreLinkEventTool"
+			/>
+			<Tool
+				Name="VCLibrarianTool"
+			/>
+			<Tool
+				Name="VCALinkTool"
+			/>
+			<Tool
+				Name="VCXDCMakeTool"
+			/>
+			<Tool
+				Name="VCBscMakeTool"
+			/>
+			<Tool
+				Name="VCFxCopTool"
+			/>
+			<Tool
+				Name="VCPostBuildEventTool"
+			/>
+		</Configuration>
+		<Configuration
+			Name="Release|Win32"
+			OutputDirectory="Release"
+			IntermediateDirectory="Release"
+			ConfigurationType="4"
+			>
+			<Tool
+				Name="VCPreBuildEventTool"
+			/>
+			<Tool
+				Name="VCCustomBuildTool"
+			/>
+			<Tool
+				Name="VCXMLDataGeneratorTool"
+			/>
+			<Tool
+				Name="VCWebServiceProxyGeneratorTool"
+			/>
+			<Tool
+				Name="VCMIDLTool"
+			/>
+			<Tool
+				Name="VCCLCompilerTool"
+				Optimization="3"
+				InlineFunctionExpansion="2"
+				EnableIntrinsicFunctions="true"
+				FavorSizeOrSpeed="1"
+				OmitFramePointers="true"
+				AdditionalIncludeDirectories=".;.\src"
+				PreprocessorDefinitions="__MSVC__;WIN32;NDEBUG;_LIB;NOMINMAX;_USE_MATH_DEFINES;USE_KISSFFT;NO_TIMING;USE_SPEEX;NO_THREAD_CHECKS"
+				RuntimeLibrary="2"
+				BufferSecurityCheck="false"
+				EnableEnhancedInstructionSet="1"
+				FloatingPointModel="2"
+				UsePrecompiledHeader="0"
+				WarningLevel="2"
+				Detect64BitPortabilityProblems="false"
+				DebugInformationFormat="3"
+			/>
+			<Tool
+				Name="VCManagedResourceCompilerTool"
+			/>
+			<Tool
+				Name="VCResourceCompilerTool"
+			/>
+			<Tool
+				Name="VCPreLinkEventTool"
+			/>
+			<Tool
+				Name="VCLibrarianTool"
+			/>
+			<Tool
+				Name="VCALinkTool"
+			/>
+			<Tool
+				Name="VCXDCMakeTool"
+			/>
+			<Tool
+				Name="VCBscMakeTool"
+			/>
+			<Tool
+				Name="VCFxCopTool"
+			/>
+			<Tool
+				Name="VCPostBuildEventTool"
+			/>
+		</Configuration>
+	</Configurations>
+	<References>
+	</References>
+	<Files>
+		<Filter
+			Name="Header Files"
+			Filter="h;hpp;hxx;hm;inl;inc;xsd"
+			UniqueIdentifier="{93995380-89BD-4b04-88EB-625FBE52EBFB}"
+			>
+			<File
+				RelativePath=".\src\system\Allocators.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\dsp\AudioCurveCalculator.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\audiocurves\CompoundAudioCurve.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\audiocurves\ConstantAudioCurve.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\dsp\FFT.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\float_cast\float_cast.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\audiocurves\HighFrequencyAudioCurve.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\dsp\MovingMedian.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\audiocurves\PercussiveAudioCurve.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\base\Profiler.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\dsp\Resampler.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\base\RingBuffer.h"
+				>
+			</File>
+			<File
+				RelativePath=".\rubberband\rubberband-c.h"
+				>
+			</File>
+			<File
+				RelativePath=".\rubberband\RubberBandStretcher.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\dsp\SampleFilter.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\base\Scavenger.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\audiocurves\SilentAudioCurve.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\audiocurves\SpectralDifferenceAudioCurve.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\speex\speex_resampler.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\StretchCalculator.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\StretcherChannelData.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\StretcherImpl.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\system\sysutils.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\system\Thread.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\system\VectorOps.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\dsp\SincWindow.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\dsp\Window.h"
+				>
+			</File>
+		</Filter>
+		<Filter
+			Name="Resource Files"
+			Filter="rc;ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe;resx"
+			UniqueIdentifier="{67DA6AB6-F800-4c08-8B7A-83BB121AAD01}"
+			>
+		</Filter>
+		<Filter
+			Name="Source Files"
+			Filter="cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx"
+			UniqueIdentifier="{4FC737F1-C7A5-4376-A066-2A32D752A2FF}"
+			>
+			<File
+				RelativePath=".\src\system\Allocators.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\dsp\AudioCurveCalculator.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\audiocurves\CompoundAudioCurve.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\audiocurves\ConstantAudioCurve.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\dsp\FFT.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\audiocurves\HighFrequencyAudioCurve.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\audiocurves\PercussiveAudioCurve.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\base\Profiler.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\speex\resample.c"
+				>
+			</File>
+			<File
+				RelativePath=".\src\dsp\Resampler.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\rubberband-c.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\RubberBandStretcher.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\audiocurves\SilentAudioCurve.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\audiocurves\SpectralDifferenceAudioCurve.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\StretchCalculator.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\StretcherChannelData.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\StretcherImpl.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\StretcherProcess.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\system\sysutils.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\system\Thread.cpp"
+				>
+			</File>
+			<File
+				RelativePath=".\src\dsp\Window.cpp"
+				>
+			</File>
+		</Filter>
+	</Files>
+	<Globals>
+	</Globals>
+</VisualStudioProject>
diff --git a/rubberband-program.vcproj b/rubberband-program.vcproj
new file mode 100644
index 0000000..23cbb93
--- /dev/null
+++ b/rubberband-program.vcproj
@@ -0,0 +1,232 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<VisualStudioProject
+	ProjectType="Visual C++"
+	Version="9.00"
+	Name="rubberband-program"
+	ProjectGUID="{06838307-FEAA-4DB0-8E08-AF19698E9C40}"
+	RootNamespace="rubberband-program"
+	Keyword="Win32Proj"
+	TargetFrameworkVersion="0"
+	>
+	<Platforms>
+		<Platform
+			Name="Win32"
+		/>
+	</Platforms>
+	<ToolFiles>
+	</ToolFiles>
+	<Configurations>
+		<Configuration
+			Name="Debug|Win32"
+			OutputDirectory="Debug"
+			IntermediateDirectory="Debug"
+			ConfigurationType="1"
+			>
+			<Tool
+				Name="VCPreBuildEventTool"
+			/>
+			<Tool
+				Name="VCCustomBuildTool"
+			/>
+			<Tool
+				Name="VCXMLDataGeneratorTool"
+			/>
+			<Tool
+				Name="VCWebServiceProxyGeneratorTool"
+			/>
+			<Tool
+				Name="VCMIDLTool"
+			/>
+			<Tool
+				Name="VCCLCompilerTool"
+				Optimization="0"
+				AdditionalIncludeDirectories=".;.\rubberband;.\src;&quot;..\libsndfile-1_0_17&quot;"
+				PreprocessorDefinitions="__MSVC__;WIN32;_DEBUG;_CONSOLE"
+				MinimalRebuild="true"
+				BasicRuntimeChecks="3"
+				RuntimeLibrary="3"
+				UsePrecompiledHeader="0"
+				WarningLevel="3"
+				Detect64BitPortabilityProblems="true"
+				DebugInformationFormat="4"
+			/>
+			<Tool
+				Name="VCManagedResourceCompilerTool"
+			/>
+			<Tool
+				Name="VCResourceCompilerTool"
+			/>
+			<Tool
+				Name="VCPreLinkEventTool"
+			/>
+			<Tool
+				Name="VCLinkerTool"
+				AdditionalDependencies=".\Debug\rubberband-library.lib ..\libsndfile-1_0_17\libsndfile-1.lib"
+				LinkIncremental="2"
+				GenerateDebugInformation="true"
+				SubSystem="1"
+				TargetMachine="1"
+			/>
+			<Tool
+				Name="VCALinkTool"
+			/>
+			<Tool
+				Name="VCManifestTool"
+			/>
+			<Tool
+				Name="VCXDCMakeTool"
+			/>
+			<Tool
+				Name="VCBscMakeTool"
+			/>
+			<Tool
+				Name="VCFxCopTool"
+			/>
+			<Tool
+				Name="VCAppVerifierTool"
+			/>
+			<Tool
+				Name="VCPostBuildEventTool"
+			/>
+		</Configuration>
+		<Configuration
+			Name="Release|Win32"
+			OutputDirectory="Release"
+			IntermediateDirectory="Release"
+			ConfigurationType="1"
+			>
+			<Tool
+				Name="VCPreBuildEventTool"
+			/>
+			<Tool
+				Name="VCCustomBuildTool"
+			/>
+			<Tool
+				Name="VCXMLDataGeneratorTool"
+			/>
+			<Tool
+				Name="VCWebServiceProxyGeneratorTool"
+			/>
+			<Tool
+				Name="VCMIDLTool"
+			/>
+			<Tool
+				Name="VCCLCompilerTool"
+				Optimization="3"
+				InlineFunctionExpansion="2"
+				EnableIntrinsicFunctions="true"
+				FavorSizeOrSpeed="1"
+				OmitFramePointers="true"
+				AdditionalIncludeDirectories=".;.\rubberband;.\src;&quot;..\libsndfile-1_0_17&quot;"
+				PreprocessorDefinitions="__MSVC__;WIN32;NDEBUG;_CONSOLE;WANT_TIMING"
+				RuntimeLibrary="2"
+				EnableEnhancedInstructionSet="1"
+				FloatingPointModel="2"
+				UsePrecompiledHeader="0"
+				WarningLevel="2"
+				Detect64BitPortabilityProblems="false"
+				DebugInformationFormat="3"
+			/>
+			<Tool
+				Name="VCManagedResourceCompilerTool"
+			/>
+			<Tool
+				Name="VCResourceCompilerTool"
+			/>
+			<Tool
+				Name="VCPreLinkEventTool"
+			/>
+			<Tool
+				Name="VCLinkerTool"
+				AdditionalDependencies=".\Release\rubberband-library.lib ..\libsndfile-1_0_17\libsndfile-1.lib"
+				LinkIncremental="0"
+				GenerateDebugInformation="false"
+				SubSystem="1"
+				OptimizeReferences="2"
+				EnableCOMDATFolding="2"
+				TargetMachine="1"
+			/>
+			<Tool
+				Name="VCALinkTool"
+			/>
+			<Tool
+				Name="VCManifestTool"
+			/>
+			<Tool
+				Name="VCXDCMakeTool"
+			/>
+			<Tool
+				Name="VCBscMakeTool"
+			/>
+			<Tool
+				Name="VCFxCopTool"
+			/>
+			<Tool
+				Name="VCAppVerifierTool"
+			/>
+			<Tool
+				Name="VCPostBuildEventTool"
+			/>
+		</Configuration>
+	</Configurations>
+	<References>
+	</References>
+	<Files>
+		<Filter
+			Name="Header Files"
+			Filter="h;hpp;hxx;hm;inl;inc;xsd"
+			UniqueIdentifier="{93995380-89BD-4b04-88EB-625FBE52EBFB}"
+			>
+			<File
+				RelativePath=".\src\float_cast\float_cast.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\getopt\getopt.h"
+				>
+			</File>
+			<File
+				RelativePath=".\rubberband\RubberBandStretcher.h"
+				>
+			</File>
+			<File
+				RelativePath=".\src\getopt\unistd.h"
+				>
+			</File>
+		</Filter>
+		<Filter
+			Name="Resource Files"
+			Filter="rc;ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe;resx"
+			UniqueIdentifier="{67DA6AB6-F800-4c08-8B7A-83BB121AAD01}"
+			>
+		</Filter>
+		<Filter
+			Name="Source Files"
+			Filter="cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx"
+			UniqueIdentifier="{4FC737F1-C7A5-4376-A066-2A32D752A2FF}"
+			>
+			<File
+				RelativePath=".\src\getopt\getopt.c"
+				>
+			</File>
+			<File
+				RelativePath=".\src\getopt\getopt_long.c"
+				>
+			</File>
+			<File
+				RelativePath=".\main\main.cpp"
+				>
+			</File>
+		</Filter>
+		<File
+			RelativePath=".\debug\BuildLog.htm"
+			>
+		</File>
+		<File
+			RelativePath=".\wub\index.html"
+			>
+		</File>
+	</Files>
+	<Globals>
+	</Globals>
+</VisualStudioProject>
diff --git a/rubberband/RubberBandStretcher.h b/rubberband/RubberBandStretcher.h
index a1d75a7..5c68af6 100644
--- a/rubberband/RubberBandStretcher.h
+++ b/rubberband/RubberBandStretcher.h
@@ -1,21 +1,30 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBANDSTRETCHER_H_
 #define _RUBBERBANDSTRETCHER_H_
     
-#define RUBBERBAND_VERSION "1.7-gpl"
+#define RUBBERBAND_VERSION "1.7"
 #define RUBBERBAND_API_MAJOR_VERSION 2
 #define RUBBERBAND_API_MINOR_VERSION 5
 
diff --git a/rubberband/rubberband-c.h b/rubberband/rubberband-c.h
index 6b74783..6113ebf 100644
--- a/rubberband/rubberband-c.h
+++ b/rubberband/rubberband-c.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_C_API_H_
@@ -19,7 +28,7 @@
 extern "C" {
 #endif
 
-#define RUBBERBAND_VERSION "1.7-gpl"
+#define RUBBERBAND_VERSION "1.7"
 #define RUBBERBAND_API_MAJOR_VERSION 2
 #define RUBBERBAND_API_MINOR_VERSION 5
 
diff --git a/src/RubberBandStretcher.cpp b/src/RubberBandStretcher.cpp
index 0c24f47..dcc6b94 100644
--- a/src/RubberBandStretcher.cpp
+++ b/src/RubberBandStretcher.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "StretcherImpl.h"
diff --git a/src/StretchCalculator.cpp b/src/StretchCalculator.cpp
index 87c4f3f..4494e66 100644
--- a/src/StretchCalculator.cpp
+++ b/src/StretchCalculator.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "StretchCalculator.h"
diff --git a/src/StretchCalculator.h b/src/StretchCalculator.h
index 51558dc..7c6a68d 100644
--- a/src/StretchCalculator.h
+++ b/src/StretchCalculator.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_STRETCH_CALCULATOR_H_
diff --git a/src/StretcherChannelData.cpp b/src/StretcherChannelData.cpp
index e1e0545..2c1225f 100644
--- a/src/StretcherChannelData.cpp
+++ b/src/StretcherChannelData.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "StretcherChannelData.h"
diff --git a/src/StretcherChannelData.h b/src/StretcherChannelData.h
index c09653f..93b4ce8 100644
--- a/src/StretcherChannelData.h
+++ b/src/StretcherChannelData.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_STRETCHERCHANNELDATA_H_
diff --git a/src/StretcherImpl.cpp b/src/StretcherImpl.cpp
index 27bce44..e868865 100644
--- a/src/StretcherImpl.cpp
+++ b/src/StretcherImpl.cpp
@@ -1,25 +1,35 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "StretcherImpl.h"
 
-#include "dsp/PercussiveAudioCurve.h"
-#include "dsp/HighFrequencyAudioCurve.h"
-#include "dsp/SpectralDifferenceAudioCurve.h"
-#include "dsp/SilentAudioCurve.h"
-#include "dsp/ConstantAudioCurve.h"
-#include "dsp/CompoundAudioCurve.h"
+#include "audiocurves/PercussiveAudioCurve.h"
+#include "audiocurves/HighFrequencyAudioCurve.h"
+#include "audiocurves/SpectralDifferenceAudioCurve.h"
+#include "audiocurves/SilentAudioCurve.h"
+#include "audiocurves/ConstantAudioCurve.h"
+#include "audiocurves/CompoundAudioCurve.h"
+
 #include "dsp/Resampler.h"
 
 #include "StretchCalculator.h"
@@ -75,7 +85,9 @@ RubberBandStretcher::Impl::Impl(size_t sampleRate,
     m_outbufSize(m_defaultFftSize * 2),
     m_maxProcessSize(m_defaultFftSize),
     m_expectedInputDuration(0),
+#ifndef NO_THREADING
     m_threaded(false),
+#endif
     m_realtime(false),
     m_options(options),
     m_debugLevel(m_defaultDebugLevel),
@@ -84,7 +96,9 @@ RubberBandStretcher::Impl::Impl(size_t sampleRate,
     m_afilter(0),
     m_swindow(0),
     m_studyFFT(0),
+#ifndef NO_THREADING
     m_spaceAvailable("space"),
+#endif
     m_inputDuration(0),
     m_detectorType(CompoundAudioCurve::CompoundDetector),
     m_silentHistory(0),
@@ -145,6 +159,7 @@ RubberBandStretcher::Impl::Impl(size_t sampleRate,
         }
     }
 
+#ifndef NO_THREADING
     if (m_channels > 1) {
 
         m_threaded = true;
@@ -162,12 +177,14 @@ RubberBandStretcher::Impl::Impl(size_t sampleRate,
             cerr << "Going multithreaded..." << endl;
         }
     }
+#endif
 
     configure();
 }
 
 RubberBandStretcher::Impl::~Impl()
 {
+#ifndef NO_THREADING
     if (m_threaded) {
         MutexLocker locker(&m_threadSetMutex);
         for (set<ProcessThread *>::iterator i = m_threadSet.begin();
@@ -180,6 +197,7 @@ RubberBandStretcher::Impl::~Impl()
             delete *i;
         }
     }
+#endif
 
     for (size_t c = 0; c < m_channels; ++c) {
         delete m_channelData[c];
@@ -204,6 +222,7 @@ RubberBandStretcher::Impl::~Impl()
 void
 RubberBandStretcher::Impl::reset()
 {
+#ifndef NO_THREADING
     if (m_threaded) {
         m_threadSetMutex.lock();
         for (set<ProcessThread *>::iterator i = m_threadSet.begin();
@@ -217,6 +236,7 @@ RubberBandStretcher::Impl::reset()
         }
         m_threadSet.clear();
     }
+#endif
 
     m_emergencyScavenger.scavenge();
 
@@ -235,7 +255,9 @@ RubberBandStretcher::Impl::reset()
     m_inputDuration = 0;
     m_silentHistory = 0;
 
+#ifndef NO_THREADING
     if (m_threaded) m_threadSetMutex.unlock();
+#endif
 
     reconfigure();
 }
@@ -534,6 +556,7 @@ RubberBandStretcher::Impl::calculateSizes()
         // the pitch scale changes
         m_outbufSize = m_outbufSize * 16;
     } else {
+#ifndef NO_THREADING
         if (m_threaded) {
             // This headroom is to permit the processing threads to
             // run ahead of the buffer output drainage; the exact
@@ -541,6 +564,7 @@ RubberBandStretcher::Impl::calculateSizes()
             // results
             m_outbufSize = m_outbufSize * 16;
         }
+#endif
     }
 
     if (m_debugLevel > 0) {
@@ -1220,6 +1244,7 @@ RubberBandStretcher::Impl::process(const float *const *input, size_t samples, bo
             }
         }
 
+#ifndef NO_THREADING
         if (m_threaded) {
             MutexLocker locker(&m_threadSetMutex);
 
@@ -1233,6 +1258,7 @@ RubberBandStretcher::Impl::process(const float *const *input, size_t samples, bo
                 cerr << m_channels << " threads created" << endl;
             }
         }
+#endif
         
         m_mode = Processing;
     }
@@ -1270,7 +1296,9 @@ RubberBandStretcher::Impl::process(const float *const *input, size_t samples, bo
 //                cerr << "process: happy with channel " << c << endl;
             }
             if (
+#ifndef NO_THREADING
                 !m_threaded &&
+#endif
                 !m_realtime) {
                 bool any = false, last = false;
                 processChunks(c, any, last);
@@ -1284,6 +1312,7 @@ RubberBandStretcher::Impl::process(const float *const *input, size_t samples, bo
             // the realtime onset detector
             processOneChunk();
         }
+#ifndef NO_THREADING
         if (m_threaded) {
             for (ThreadSet::iterator i = m_threadSet.begin();
                  i != m_threadSet.end(); ++i) {
@@ -1295,6 +1324,7 @@ RubberBandStretcher::Impl::process(const float *const *input, size_t samples, bo
             }
             m_spaceAvailable.unlock();
         }
+#endif
 
         if (m_debugLevel > 2) {
             if (!allConsumed) cerr << "process looping" << endl;
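Review note on the StretcherImpl.cpp changes above: the member, its initialiser, and every use of m_threaded are wrapped in the same NO_THREADING guard, including the guard placed in the middle of an `if` condition. A minimal sketch of that pattern follows. `MiniStretcher` and `processesInline` are hypothetical names, and the "more than one channel" rule is a simplification of conditions the hunks do not show in full.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical miniature of the guarded-member pattern. With
// -DNO_THREADING the class has no m_threaded member at all, so every
// reference to it must sit inside the same guard.
class MiniStretcher
{
public:
    explicit MiniStretcher(std::size_t channels) :
        m_channels(channels),
#ifndef NO_THREADING
        m_threaded(false),
#endif
        m_realtime(false)
    {
        assert(channels > 0);
#ifndef NO_THREADING
        // Simplified assumption: go threaded whenever there is more
        // than one channel.
        if (m_channels > 1) m_threaded = true;
#endif
    }

    // True when chunks would be processed inline rather than handed
    // to worker threads.
    bool processesInline() const
    {
        return
#ifndef NO_THREADING
            !m_threaded &&
#endif
            !m_realtime;
    }

private:
    std::size_t m_channels;
#ifndef NO_THREADING
    bool m_threaded;
#endif
    bool m_realtime;
};
```

Built without NO_THREADING, a two-channel instance reports that it does not process inline. Built with it, the threaded term disappears from the condition and only m_realtime is consulted.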
diff --git a/src/StretcherImpl.h b/src/StretcherImpl.h
index 6d34991..82f0b32 100644
--- a/src/StretcherImpl.h
+++ b/src/StretcherImpl.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_STRETCHERIMPL_H_
@@ -20,7 +29,8 @@
 #include "dsp/Window.h"
 #include "dsp/SincWindow.h"
 #include "dsp/FFT.h"
-#include "dsp/CompoundAudioCurve.h"
+
+#include "audiocurves/CompoundAudioCurve.h"
 
 #include "base/RingBuffer.h"
 #include "base/Scavenger.h"
@@ -34,7 +44,11 @@ using namespace RubberBand;
 namespace RubberBand
 {
 
+#ifdef PROCESS_SAMPLE_TYPE
+typedef PROCESS_SAMPLE_TYPE process_t;
+#else
 typedef double process_t;
+#endif
 
 class AudioCurveCalculator;
 class StretchCalculator;
@@ -161,7 +175,9 @@ protected:
     size_t m_maxProcessSize;
     size_t m_expectedInputDuration;
 
+#ifndef NO_THREADING
     bool m_threaded;
+#endif
 
     bool m_realtime;
     Options m_options;
@@ -183,6 +199,7 @@ protected:
     Window<float> *m_swindow;
     FFT *m_studyFFT;
 
+#ifndef NO_THREADING
     Condition m_spaceAvailable;
     
     class ProcessThread : public Thread
@@ -203,6 +220,13 @@ protected:
     typedef std::set<ProcessThread *> ThreadSet;
     ThreadSet m_threadSet;
     
+#if defined HAVE_IPP && !defined USE_SPEEX
+    // Exasperatingly, the IPP polyphase resampler does not appear to
+    // be thread-safe as advertised -- a good reason to prefer the
+    // Speex alternative
+    Mutex m_resamplerMutex;
+#endif
+#endif
 
     size_t m_inputDuration;
     CompoundAudioCurve::Type m_detectorType;
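Review note on m_resamplerMutex: the header comment says the IPP polyphase resampler does not appear to be thread-safe, so resampler calls are serialised when running threaded. One way to express "lock only when threaded" is a scoped helper, sketched below. This is an illustration and not the approach taken in this patch; `ConditionalLocker`, `CountingMutex`, and `resampleGuarded` are hypothetical names, and the helper assumes only that the mutex type offers lock() and unlock().

```cpp
#include <cassert>

// Hypothetical RAII helper: takes the mutex only when `enabled` is
// true and releases it on scope exit, including early returns.
template <typename MutexType>
class ConditionalLocker
{
public:
    ConditionalLocker(MutexType &mutex, bool enabled) :
        m_mutex(mutex), m_locked(enabled)
    {
        if (m_locked) m_mutex.lock();
    }
    ~ConditionalLocker()
    {
        if (m_locked) m_mutex.unlock();
    }
    ConditionalLocker(const ConditionalLocker &) = delete;
    ConditionalLocker &operator=(const ConditionalLocker &) = delete;

private:
    MutexType &m_mutex;
    bool m_locked;
};

// Stand-in mutex that only counts calls, so the behaviour can be
// observed without real threads.
struct CountingMutex
{
    int locks = 0;
    int unlocks = 0;
    void lock() { ++locks; }
    void unlock() { ++unlocks; }
};

// Stand-in for a guarded resampler call. Returns 1 if the mutex was
// held while the (omitted) resampling would run, 0 otherwise.
int resampleGuarded(CountingMutex &resamplerMutex, bool threaded)
{
    ConditionalLocker<CountingMutex> locker(resamplerMutex, threaded);
    assert(resamplerMutex.locks >= resamplerMutex.unlocks);
    return resamplerMutex.locks - resamplerMutex.unlocks;
}
```

A scoped helper keeps the lock and unlock paired even if a return is later added between them. With explicit lock()/unlock() calls, that pairing has to be maintained by hand.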
diff --git a/src/StretcherProcess.cpp b/src/StretcherProcess.cpp
index b228596..cf5361a 100644
--- a/src/StretcherProcess.cpp
+++ b/src/StretcherProcess.cpp
@@ -1,22 +1,31 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "StretcherImpl.h"
 
-#include "dsp/PercussiveAudioCurve.h"
-#include "dsp/HighFrequencyAudioCurve.h"
-#include "dsp/ConstantAudioCurve.h"
+#include "audiocurves/PercussiveAudioCurve.h"
+#include "audiocurves/HighFrequencyAudioCurve.h"
+#include "audiocurves/ConstantAudioCurve.h"
 
 #include "StretchCalculator.h"
 #include "StretcherChannelData.h"
@@ -42,6 +51,7 @@ using std::endl;
 
 namespace RubberBand {
 
+#ifndef NO_THREADING
 
 RubberBandStretcher::Impl::ProcessThread::ProcessThread(Impl *s, size_t c) :
     m_s(s),
@@ -117,6 +127,7 @@ RubberBandStretcher::Impl::ProcessThread::abandon()
     m_abandoning = true;
 }
 
+#endif
 
 bool
 RubberBandStretcher::Impl::resampleBeforeStretching() const
@@ -194,6 +205,13 @@ RubberBandStretcher::Impl::consumeChannel(size_t c,
             cd.setResampleBufSize(reqSize);
         }
 
+#ifndef NO_THREADING
+#if defined HAVE_IPP && !defined USE_SPEEX
+        if (m_threaded) {
+            m_resamplerMutex.lock();
+        }
+#endif
+#endif
 
         if (useMidSide) {
             ms = (float *)alloca(samples * sizeof(float));
@@ -209,6 +227,13 @@ RubberBandStretcher::Impl::consumeChannel(size_t c,
                                          1.0 / m_pitchScale,
                                          final);
 
+#ifndef NO_THREADING
+#if defined HAVE_IPP && !defined USE_SPEEX
+        if (m_threaded) {
+            m_resamplerMutex.unlock();
+        }
+#endif
+#endif
     }
 
     if (writable < toWrite) {
@@ -373,14 +398,18 @@ RubberBandStretcher::Impl::testInbufReadSpace(size_t c)
             // its input -- and that would give incorrect output, as
             // we know there is more input to come.
 
+#ifndef NO_THREADING
             if (!m_threaded) {
+#endif
                 if (m_debugLevel > 1) {
                     cerr << "WARNING: RubberBandStretcher: read space < chunk size ("
                          << inbuf.getReadSpace() << " < " << m_aWindowSize
                          << ") when not all input written, on processChunks for channel " << c << endl;
                 }
 
+#ifndef NO_THREADING
             }
+#endif
             return false;
         }
         
@@ -956,11 +985,13 @@ RubberBandStretcher::Impl::synthesiseChunk(size_t channel,
 
     if (!cd.unchanged) {
 
-        cd.fft->inversePolar(cd.mag, cd.phase, cd.dblbuf);
-
-        // our ffts produced unscaled results
+        // Our FFTs produced unscaled results. Scale before inverse
+        // transform rather than after, to avoid overflow if using a
+        // fixed-point FFT.
         float factor = 1.f / fsz;
-        v_scale(dblbuf, factor, fsz);
+        v_scale(cd.mag, factor, hs + 1);
+
+        cd.fft->inversePolar(cd.mag, cd.phase, cd.dblbuf);
 
         if (wsz == fsz) {
             v_convert(fltbuf, dblbuf + hs, hs);
@@ -1044,6 +1075,13 @@ RubberBandStretcher::Impl::writeChunk(size_t channel, size_t shiftIncrement, boo
             cd.setResampleBufSize(reqSize);
         }
 
+#ifndef NO_THREADING
+#if defined HAVE_IPP && !defined USE_SPEEX
+        if (m_threaded) {
+            m_resamplerMutex.lock();
+        }
+#endif
+#endif
 
         size_t outframes = cd.resampler->resample(&cd.accumulator,
                                                   &cd.resamplebuf,
@@ -1051,6 +1089,13 @@ RubberBandStretcher::Impl::writeChunk(size_t channel, size_t shiftIncrement, boo
                                                   1.0 / m_pitchScale,
                                                   last);
 
+#ifndef NO_THREADING
+#if defined HAVE_IPP && !defined USE_SPEEX
+        if (m_threaded) {
+            m_resamplerMutex.unlock();
+        }
+#endif
+#endif
 
         writeOutput(*cd.outbuf, cd.resamplebuf,
                     outframes, cd.outCount, theoreticalOut);
@@ -1158,14 +1203,18 @@ RubberBandStretcher::Impl::available() const
 {
     Profiler profiler("RubberBandStretcher::Impl::available");
 
+#ifndef NO_THREADING
     if (m_threaded) {
         MutexLocker locker(&m_threadSetMutex);
         if (m_channelData.empty()) return 0;
     } else {
         if (m_channelData.empty()) return 0;
     }
+#endif
 
+#ifndef NO_THREADING
     if (!m_threaded) {
+#endif
         for (size_t c = 0; c < m_channels; ++c) {
             if (m_channelData[c]->inputSize >= 0) {
 //                cerr << "available: m_done true" << endl;
@@ -1180,7 +1229,9 @@ RubberBandStretcher::Impl::available() const
                 }
             }
         }
+#ifndef NO_THREADING
     }
+#endif
 
     size_t min = 0;
     bool consumed = true;
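Review note on the synthesiseChunk change in StretcherProcess.cpp: the 1/fsz factor is now applied to the hs + 1 magnitudes before the inverse transform instead of to the fsz output samples after it. Because the inverse transform is linear, the two orderings agree up to rounding in floating point, while the pre-scaled form keeps intermediate values smaller, which is the overflow concern the new comment raises for fixed-point FFTs. The sketch below checks that equivalence with a naive inverse DFT. `naiveInversePolar` and `maxScalingOrderDifference` are hypothetical helpers, not the library's FFT class, and an even fsz is assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an unscaled inverse real DFT with polar
// input: mag and phase hold hs + 1 bins (hs = fsz / 2), and the
// result holds fsz samples.
std::vector<double> naiveInversePolar(const std::vector<double> &mag,
                                      const std::vector<double> &phase,
                                      std::size_t fsz)
{
    const std::size_t hs = fsz / 2;
    assert(mag.size() == hs + 1 && phase.size() == hs + 1);
    const double pi = std::acos(-1.0);
    std::vector<double> out(fsz, 0.0);
    for (std::size_t n = 0; n < fsz; ++n) {
        double sum = 0.0;
        for (std::size_t k = 0; k <= hs; ++k) {
            // DC and Nyquist bins count once; the others also stand
            // in for their conjugate-symmetric partners.
            const double weight = (k == 0 || k == hs) ? 1.0 : 2.0;
            sum += weight * mag[k] *
                std::cos(2.0 * pi * double(k) * double(n) / double(fsz)
                         + phase[k]);
        }
        out[n] = sum;
    }
    return out;
}

// Largest absolute difference between "scale magnitudes, then invert"
// and "invert, then scale the output".
double maxScalingOrderDifference(const std::vector<double> &mag,
                                 const std::vector<double> &phase,
                                 std::size_t fsz)
{
    const double factor = 1.0 / double(fsz);

    std::vector<double> scaledMag(mag);
    for (double &m : scaledMag) m *= factor;
    const std::vector<double> before =
        naiveInversePolar(scaledMag, phase, fsz);

    std::vector<double> after = naiveInversePolar(mag, phase, fsz);
    for (double &s : after) s *= factor;

    double worst = 0.0;
    for (std::size_t i = 0; i < fsz; ++i) {
        worst = std::max(worst, std::fabs(before[i] - after[i]));
    }
    return worst;
}
```

Scaling the magnitudes also touches hs + 1 values rather than fsz, roughly half as many multiplications as scaling the output.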
diff --git a/src/dsp/CompoundAudioCurve.cpp b/src/audiocurves/CompoundAudioCurve.cpp
similarity index 86%
rename from src/dsp/CompoundAudioCurve.cpp
rename to src/audiocurves/CompoundAudioCurve.cpp
index fe7bb3b..ac47130 100644
--- a/src/dsp/CompoundAudioCurve.cpp
+++ b/src/audiocurves/CompoundAudioCurve.cpp
@@ -1,20 +1,29 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "CompoundAudioCurve.h"
 
-#include "MovingMedian.h"
+#include "dsp/MovingMedian.h"
 
 #include <iostream>
 
diff --git a/src/dsp/CompoundAudioCurve.h b/src/audiocurves/CompoundAudioCurve.h
similarity index 72%
rename from src/dsp/CompoundAudioCurve.h
rename to src/audiocurves/CompoundAudioCurve.h
index d3b2b55..c5bf0d8 100644
--- a/src/dsp/CompoundAudioCurve.h
+++ b/src/audiocurves/CompoundAudioCurve.h
@@ -1,24 +1,33 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _COMPOUND_AUDIO_CURVE_H_
 #define _COMPOUND_AUDIO_CURVE_H_
 
-#include "AudioCurveCalculator.h"
+#include "dsp/AudioCurveCalculator.h"
 #include "PercussiveAudioCurve.h"
 #include "HighFrequencyAudioCurve.h"
-#include "SampleFilter.h"
+#include "dsp/SampleFilter.h"
 
 namespace RubberBand
 {
diff --git a/src/dsp/ConstantAudioCurve.cpp b/src/audiocurves/ConstantAudioCurve.cpp
similarity index 64%
rename from src/dsp/ConstantAudioCurve.cpp
rename to src/audiocurves/ConstantAudioCurve.cpp
index 19e2483..d1b6c91 100644
--- a/src/dsp/ConstantAudioCurve.cpp
+++ b/src/audiocurves/ConstantAudioCurve.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "ConstantAudioCurve.h"
diff --git a/src/dsp/ConstantAudioCurve.h b/src/audiocurves/ConstantAudioCurve.h
similarity index 62%
rename from src/dsp/ConstantAudioCurve.h
rename to src/audiocurves/ConstantAudioCurve.h
index b9da9ff..e6b2054 100644
--- a/src/dsp/ConstantAudioCurve.h
+++ b/src/audiocurves/ConstantAudioCurve.h
@@ -1,21 +1,30 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _CONSTANT_AUDIO_CURVE_H_
 #define _CONSTANT_AUDIO_CURVE_H_
 
-#include "AudioCurveCalculator.h"
+#include "dsp/AudioCurveCalculator.h"
 
 namespace RubberBand
 {
diff --git a/src/dsp/HighFrequencyAudioCurve.cpp b/src/audiocurves/HighFrequencyAudioCurve.cpp
similarity index 71%
rename from src/dsp/HighFrequencyAudioCurve.cpp
rename to src/audiocurves/HighFrequencyAudioCurve.cpp
index eca62e3..f33eb4b 100644
--- a/src/dsp/HighFrequencyAudioCurve.cpp
+++ b/src/audiocurves/HighFrequencyAudioCurve.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "HighFrequencyAudioCurve.h"
diff --git a/src/dsp/HighFrequencyAudioCurve.h b/src/audiocurves/HighFrequencyAudioCurve.h
similarity index 64%
rename from src/dsp/HighFrequencyAudioCurve.h
rename to src/audiocurves/HighFrequencyAudioCurve.h
index ac436b0..5172669 100644
--- a/src/dsp/HighFrequencyAudioCurve.h
+++ b/src/audiocurves/HighFrequencyAudioCurve.h
@@ -1,21 +1,30 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _HIGHFREQUENCY_AUDIO_CURVE_H_
 #define _HIGHFREQUENCY_AUDIO_CURVE_H_
 
-#include "AudioCurveCalculator.h"
+#include "dsp/AudioCurveCalculator.h"
 
 namespace RubberBand
 {
diff --git a/src/dsp/PercussiveAudioCurve.cpp b/src/audiocurves/PercussiveAudioCurve.cpp
similarity index 83%
rename from src/dsp/PercussiveAudioCurve.cpp
rename to src/audiocurves/PercussiveAudioCurve.cpp
index eb82f38..a1f76b0 100644
--- a/src/dsp/PercussiveAudioCurve.cpp
+++ b/src/audiocurves/PercussiveAudioCurve.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "PercussiveAudioCurve.h"
diff --git a/src/dsp/PercussiveAudioCurve.h b/src/audiocurves/PercussiveAudioCurve.h
similarity index 66%
rename from src/dsp/PercussiveAudioCurve.h
rename to src/audiocurves/PercussiveAudioCurve.h
index 0030a2b..c6d2fbb 100644
--- a/src/dsp/PercussiveAudioCurve.h
+++ b/src/audiocurves/PercussiveAudioCurve.h
@@ -1,21 +1,30 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _PERCUSSIVE_AUDIO_CURVE_H_
 #define _PERCUSSIVE_AUDIO_CURVE_H_
 
-#include "AudioCurveCalculator.h"
+#include "dsp/AudioCurveCalculator.h"
 
 namespace RubberBand
 {
diff --git a/src/dsp/SilentAudioCurve.cpp b/src/audiocurves/SilentAudioCurve.cpp
similarity index 71%
rename from src/dsp/SilentAudioCurve.cpp
rename to src/audiocurves/SilentAudioCurve.cpp
index 4a83e8a..dbfd6bc 100644
--- a/src/dsp/SilentAudioCurve.cpp
+++ b/src/audiocurves/SilentAudioCurve.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "SilentAudioCurve.h"
diff --git a/src/dsp/SilentAudioCurve.h b/src/audiocurves/SilentAudioCurve.h
similarity index 63%
rename from src/dsp/SilentAudioCurve.h
rename to src/audiocurves/SilentAudioCurve.h
index 258688f..4e07353 100644
--- a/src/dsp/SilentAudioCurve.h
+++ b/src/audiocurves/SilentAudioCurve.h
@@ -1,21 +1,30 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _SILENT_AUDIO_CURVE_H_
 #define _SILENT_AUDIO_CURVE_H_
 
-#include "AudioCurveCalculator.h"
+#include "dsp/AudioCurveCalculator.h"
 
 namespace RubberBand
 {
diff --git a/src/dsp/SpectralDifferenceAudioCurve.cpp b/src/audiocurves/SpectralDifferenceAudioCurve.cpp
similarity index 81%
rename from src/dsp/SpectralDifferenceAudioCurve.cpp
rename to src/audiocurves/SpectralDifferenceAudioCurve.cpp
index 5ac15d5..5afa810 100644
--- a/src/dsp/SpectralDifferenceAudioCurve.cpp
+++ b/src/audiocurves/SpectralDifferenceAudioCurve.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "SpectralDifferenceAudioCurve.h"
diff --git a/src/dsp/SpectralDifferenceAudioCurve.h b/src/audiocurves/SpectralDifferenceAudioCurve.h
similarity index 66%
rename from src/dsp/SpectralDifferenceAudioCurve.h
rename to src/audiocurves/SpectralDifferenceAudioCurve.h
index 2d5b2d3..20f9c64 100644
--- a/src/dsp/SpectralDifferenceAudioCurve.h
+++ b/src/audiocurves/SpectralDifferenceAudioCurve.h
@@ -1,22 +1,31 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _SPECTRALDIFFERENCE_AUDIO_CURVE_H_
 #define _SPECTRALDIFFERENCE_AUDIO_CURVE_H_
 
-#include "AudioCurveCalculator.h"
-#include "Window.h"
+#include "dsp/AudioCurveCalculator.h"
+#include "dsp/Window.h"
 
 namespace RubberBand
 {
diff --git a/src/base/Profiler.cpp b/src/base/Profiler.cpp
index a890c29..e899b23 100644
--- a/src/base/Profiler.cpp
+++ b/src/base/Profiler.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "Profiler.h"
@@ -19,7 +28,12 @@
 #include <string>
 #include <map>
 
-#include <cstdio>
+#include <stdio.h>
+
+#ifdef __MSVC__
+// Ugh --cc
+#define snprintf sprintf_s
+#endif
 
 namespace RubberBand {
 
@@ -53,39 +67,23 @@ Profiler::add(const char *id, float ms)
 void
 Profiler::dump()
 {
+    std::string report = getReport();
+    fprintf(stderr, "%s", report.c_str());
+}
+
+std::string
+Profiler::getReport()
+{
+    static const int buflen = 256;
+    char buffer[buflen];
+    std::string report;
+
 #ifdef PROFILE_CLOCKS
-    fprintf(stderr, "Profiling points [CPU time]:\n");
+    snprintf(buffer, buflen, "Profiling points [CPU time]:\n");
 #else
-    fprintf(stderr, "Profiling points [Wall time]:\n");
+    snprintf(buffer, buflen, "Profiling points [Wall time]:\n");
 #endif
-
-    fprintf(stderr, "\nBy name:\n");
-
-    typedef std::set<const char *, std::less<std::string> > StringSet;
-
-    StringSet profileNames;
-    for (ProfileMap::const_iterator i = m_profiles.begin();
-         i != m_profiles.end(); ++i) {
-        profileNames.insert(i->first);
-    }
-
-    for (StringSet::const_iterator i = profileNames.begin();
-         i != profileNames.end(); ++i) {
-
-        ProfileMap::const_iterator j = m_profiles.find(*i);
-        if (j == m_profiles.end()) continue;
-
-        const TimePair &pp(j->second);
-        fprintf(stderr, "%s(%d):\n", *i, pp.first);
-        fprintf(stderr, "\tReal: \t%f ms      \t[%f ms total]\n",
-                (pp.second / pp.first),
-                (pp.second));
-
-        WorstCallMap::const_iterator k = m_worstCalls.find(*i);
-        if (k == m_worstCalls.end()) continue;
-        
-        fprintf(stderr, "\tWorst:\t%f ms/call\n", k->second);
-    }
+    report += buffer;
 
     typedef std::multimap<float, const char *> TimeRMap;
     typedef std::multimap<int, const char *> IntRMap;
@@ -105,29 +103,71 @@ Profiler::dump()
         worstmap.insert(TimeRMap::value_type(i->second, i->first));
     }
 
-    fprintf(stderr, "\nBy total:\n");
+    snprintf(buffer, buflen, "\nBy total:\n");
+    report += buffer;
     for (TimeRMap::const_iterator i = totmap.end(); i != totmap.begin(); ) {
         --i;
-        fprintf(stderr, "%-40s  %f ms\n", i->second, i->first);
+        snprintf(buffer, buflen, "%-40s  %f ms\n", i->second, i->first);
+        report += buffer;
     }
 
-    fprintf(stderr, "\nBy average:\n");
+    snprintf(buffer, buflen, "\nBy average:\n");
+    report += buffer;
     for (TimeRMap::const_iterator i = avgmap.end(); i != avgmap.begin(); ) {
         --i;
-        fprintf(stderr, "%-40s  %f ms\n", i->second, i->first);
+        snprintf(buffer, buflen, "%-40s  %f ms\n", i->second, i->first);
+        report += buffer;
     }
 
-    fprintf(stderr, "\nBy worst case:\n");
+    snprintf(buffer, buflen, "\nBy worst case:\n");
+    report += buffer;
     for (TimeRMap::const_iterator i = worstmap.end(); i != worstmap.begin(); ) {
         --i;
-        fprintf(stderr, "%-40s  %f ms\n", i->second, i->first);
+        snprintf(buffer, buflen, "%-40s  %f ms\n", i->second, i->first);
+        report += buffer;
     }
 
-    fprintf(stderr, "\nBy number of calls:\n");
+    snprintf(buffer, buflen, "\nBy number of calls:\n");
+    report += buffer;
     for (IntRMap::const_iterator i = ncallmap.end(); i != ncallmap.begin(); ) {
         --i;
-        fprintf(stderr, "%-40s  %d\n", i->second, i->first);
+        snprintf(buffer, buflen, "%-40s  %d\n", i->second, i->first);
+        report += buffer;
     }
+
+    snprintf(buffer, buflen, "\nBy name:\n");
+    report += buffer;
+
+    typedef std::set<const char *, std::less<std::string> > StringSet;
+
+    StringSet profileNames;
+    for (ProfileMap::const_iterator i = m_profiles.begin();
+         i != m_profiles.end(); ++i) {
+        profileNames.insert(i->first);
+    }
+
+    for (StringSet::const_iterator i = profileNames.begin();
+         i != profileNames.end(); ++i) {
+
+        ProfileMap::const_iterator j = m_profiles.find(*i);
+        if (j == m_profiles.end()) continue;
+
+        const TimePair &pp(j->second);
+        snprintf(buffer, buflen, "%s(%d):\n", *i, pp.first);
+        report += buffer;
+        snprintf(buffer, buflen, "\tReal: \t%f ms      \t[%f ms total]\n",
+                (pp.second / pp.first),
+                (pp.second));
+        report += buffer;
+
+        WorstCallMap::const_iterator k = m_worstCalls.find(*i);
+        if (k == m_worstCalls.end()) continue;
+        
+        snprintf(buffer, buflen, "\tWorst:\t%f ms/call\n", k->second);
+        report += buffer;
+    }
+
+    return report;
 }
 
 Profiler::Profiler(const char* c) :
diff --git a/src/base/Profiler.h b/src/base/Profiler.h
index 7e3e967..e7bb4b9 100644
--- a/src/base/Profiler.h
+++ b/src/base/Profiler.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _PROFILER_H_
@@ -40,7 +49,10 @@
 #endif
 #endif
 
+#ifndef NO_TIMING
 #include <map>
+#include <string>
+#endif
 
 namespace RubberBand {
 
@@ -56,6 +68,12 @@ public:
 
     static void dump();
 
+    // Unlike the other functions, this is only defined if NO_TIMING
+    // is not set (because it uses std::string which is otherwise
+    // unused here). So, treat this as a tricksy internal function
+    // rather than an API call and guard any call to it appropriately.
+    static std::string getReport();
+
 protected:
     const char* m_c;
 #ifdef PROFILE_CLOCKS
diff --git a/src/base/RingBuffer.h b/src/base/RingBuffer.h
index faea993..dfad37f 100644
--- a/src/base/RingBuffer.h
+++ b/src/base/RingBuffer.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_RINGBUFFER_H_
@@ -47,7 +56,7 @@ public:
      * power of two, this means n should ideally be some power of two
      * minus one.
      */
-    RingBuffer(int n = 0);
+    RingBuffer(int n);
 
     virtual ~RingBuffer();
 
@@ -268,8 +277,7 @@ RingBuffer<T>::reset()
     std::cerr << "RingBuffer<T>[" << this << "]::reset" << std::endl;
 #endif
 
-    m_writer = 0;
-    m_reader = 0;
+    m_reader = m_writer;
 }
 
 template <typename T>
@@ -298,7 +306,7 @@ RingBuffer<T>::read(S *const R__ destination, int n)
     if (n > available) {
 	std::cerr << "WARNING: RingBuffer::read: " << n << " requested, only "
                   << available << " available" << std::endl;
-        v_zero(destination + available, n - available);
+//!!!        v_zero(destination + available, n - available);
 	n = available;
     }
     if (n == 0) return n;
@@ -367,7 +375,7 @@ RingBuffer<T>::readOne()
     if (w == r) {
 	std::cerr << "WARNING: RingBuffer::readOne: no sample available"
 		  << std::endl;
-	return 0;
+	return T();
     }
 
     T value = m_buffer[r];
diff --git a/src/base/Scavenger.h b/src/base/Scavenger.h
index e1be933..a069056 100644
--- a/src/base/Scavenger.h
+++ b/src/base/Scavenger.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_SCAVENGER_H_
diff --git a/src/dsp/AudioCurveCalculator.cpp b/src/dsp/AudioCurveCalculator.cpp
index ad37782..faaa277 100644
--- a/src/dsp/AudioCurveCalculator.cpp
+++ b/src/dsp/AudioCurveCalculator.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "AudioCurveCalculator.h"
diff --git a/src/dsp/AudioCurveCalculator.h b/src/dsp/AudioCurveCalculator.h
index 2b2bc75..ecaf78c 100644
--- a/src/dsp/AudioCurveCalculator.h
+++ b/src/dsp/AudioCurveCalculator.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _AUDIO_CURVE_CALCULATOR_H_
@@ -93,6 +102,13 @@ public:
      */
     virtual double processDouble(const double *R__ mag, int increment) = 0;
 
+    /**
+     * Obtain a confidence for the curve value (if applicable). A
+     * value of 1.0 indicates perfect confidence in the curve
+     * calculation, 0.0 indicates none.
+     */
+    virtual double getConfidence() const { return 1.0; }
+
     /**
      * Reset the calculator, forgetting the history of the audio input
      * so far.
diff --git a/src/dsp/FFT.cpp b/src/dsp/FFT.cpp
index e4b69d1..7142093 100644
--- a/src/dsp/FFT.cpp
+++ b/src/dsp/FFT.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "FFT.h"
@@ -21,25 +30,58 @@
 
 //#define FFT_MEASUREMENT 1
 
+#ifdef FFT_MEASUREMENT
+#include <sstream>
+#endif
+
+#ifdef HAVE_IPP
+#include <ipps.h>
+#endif
 
 #ifdef HAVE_FFTW3
 #include <fftw3.h>
 #endif
 
+#ifdef HAVE_VDSP
+#include <vecLib/vDSP.h>
+#include <vecLib/vForce.h>
+#endif
 
+#ifdef HAVE_MEDIALIB
+#include <mlib_signal.h>
+#endif
 
+#ifdef HAVE_OPENMAX
+#include <omxSP.h>
+#endif
+
+#ifdef HAVE_SFFT
+extern "C" {
+#include <sfft.h>
+}
+#endif
 
 #ifdef USE_KISSFFT
 #include "kissfft/kiss_fftr.h"
 #endif
 
+#ifndef HAVE_IPP
 #ifndef HAVE_FFTW3
 #ifndef USE_KISSFFT
 #ifndef USE_BUILTIN_FFT
+#ifndef HAVE_VDSP
+#ifndef HAVE_MEDIALIB
+#ifndef HAVE_OPENMAX
+#ifndef HAVE_SFFT
 #error No FFT implementation selected!
 #endif
 #endif
 #endif
+#endif
+#endif
+#endif
+#endif
+#endif
 
 #include <cmath>
 #include <iostream>
@@ -48,6 +90,11 @@
 #include <cstdlib>
 #include <vector>
 
+#ifdef FFT_MEASUREMENT
+#ifndef _WIN32
+#include <unistd.h>
+#endif
+#endif
 
 namespace RubberBand {
 
@@ -56,6 +103,8 @@ class FFTImpl
 public:
     virtual ~FFTImpl() { }
 
+    virtual FFT::Precisions getSupportedPrecisions() const = 0;
+
     virtual void initFloat() = 0;
     virtual void initDouble() = 0;
 
@@ -82,33 +131,1392 @@ public:
 
 namespace FFTs {
 
+#ifdef HAVE_IPP
 
+class D_IPP : public FFTImpl
+{
+public:
+    D_IPP(int size) :
+        m_size(size), m_fspec(0), m_dspec(0)
+    { 
+        for (int i = 0; ; ++i) {
+            if (m_size & (1 << i)) {
+                m_order = i;
+                break;
+            }
+        }
+    }
 
+    ~D_IPP() {
+        if (m_fspec) {
+            ippsFFTFree_R_32f(m_fspec);
+            ippsFree(m_fbuf);
+            ippsFree(m_fpacked);
+            ippsFree(m_fspare);
+        }
+        if (m_dspec) {
+            ippsFFTFree_R_64f(m_dspec);
+            ippsFree(m_dbuf);
+            ippsFree(m_dpacked);
+            ippsFree(m_dspare);
+        }
+    }
 
+    FFT::Precisions
+    getSupportedPrecisions() const {
+        return FFT::SinglePrecision | FFT::DoublePrecision;
+    }
+
+    //!!! rv check
+
+    void initFloat() {
+        if (m_fspec) return;
+        int specSize, specBufferSize, bufferSize;
+        ippsFFTGetSize_R_32f(m_order, IPP_FFT_NODIV_BY_ANY, ippAlgHintFast,
+                             &specSize, &specBufferSize, &bufferSize);
+        m_fbuf = ippsMalloc_8u(bufferSize);
+        m_fpacked = ippsMalloc_32f(m_size + 2);
+        m_fspare = ippsMalloc_32f(m_size / 2 + 1);
+        ippsFFTInitAlloc_R_32f(&m_fspec, m_order, IPP_FFT_NODIV_BY_ANY, 
+                               ippAlgHintFast);
+    }
+
+    void initDouble() {
+        if (m_dspec) return;
+        int specSize, specBufferSize, bufferSize;
+        ippsFFTGetSize_R_64f(m_order, IPP_FFT_NODIV_BY_ANY, ippAlgHintFast,
+                             &specSize, &specBufferSize, &bufferSize);
+        m_dbuf = ippsMalloc_8u(bufferSize);
+        m_dpacked = ippsMalloc_64f(m_size + 2);
+        m_dspare = ippsMalloc_64f(m_size / 2 + 1);
+        ippsFFTInitAlloc_R_64f(&m_dspec, m_order, IPP_FFT_NODIV_BY_ANY, 
+                               ippAlgHintFast);
+    }
+
+    void packFloat(const float *R__ re, const float *R__ im) {
+        Profiler profiler("D_IPP::packFloat");
+        int index = 0;
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            m_fpacked[index++] = re[i];
+            index++;
+        }
+        index = 0;
+        if (im) {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                m_fpacked[index++] = im[i];
+            }
+        } else {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                m_fpacked[index++] = 0.f;
+            }
+        }
+    }
+
+    void packDouble(const double *R__ re, const double *R__ im) {
+        Profiler profiler("D_IPP::packDouble");
+        int index = 0;
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            m_dpacked[index++] = re[i];
+            index++;
+        }
+        index = 0;
+        if (im) {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                m_dpacked[index++] = im[i];
+            }
+        } else {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                m_dpacked[index++] = 0.0;
+            }
+        }
+    }
+
+    void unpackFloat(float *re, float *R__ im) { // re may be equal to m_fpacked
+        Profiler profiler("D_IPP::unpackFloat");
+        int index = 0;
+        const int hs = m_size/2;
+        if (im) {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                im[i] = m_fpacked[index++];
+            }
+        }
+        index = 0;
+        for (int i = 0; i <= hs; ++i) {
+            re[i] = m_fpacked[index++];
+            index++;
+        }
+    }        
+
+    void unpackDouble(double *re, double *R__ im) { // re may be equal to m_dpacked
+        Profiler profiler("D_IPP::unpackDouble");
+        int index = 0;
+        const int hs = m_size/2;
+        if (im) {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                im[i] = m_dpacked[index++];
+            }
+        }
+        index = 0;
+        for (int i = 0; i <= hs; ++i) {
+            re[i] = m_dpacked[index++];
+            index++;
+        }
+    }        
+
+    void forward(const double *R__ realIn, double *R__ realOut, double *R__ imagOut) {
+        Profiler profiler("D_IPP::forward [d]");
+        if (!m_dspec) initDouble();
+        ippsFFTFwd_RToCCS_64f(realIn, m_dpacked, m_dspec, m_dbuf);
+        unpackDouble(realOut, imagOut);
+    }
+
+    void forwardInterleaved(const double *R__ realIn, double *R__ complexOut) {
+        Profiler profiler("D_IPP::forwardInterleaved [d]");
+        if (!m_dspec) initDouble();
+        ippsFFTFwd_RToCCS_64f(realIn, complexOut, m_dspec, m_dbuf);
+    }
+
+    void forwardPolar(const double *R__ realIn, double *R__ magOut, double *R__ phaseOut) {
+        Profiler profiler("D_IPP::forwardPolar [d]");
+        if (!m_dspec) initDouble();
+        ippsFFTFwd_RToCCS_64f(realIn, m_dpacked, m_dspec, m_dbuf);
+        unpackDouble(m_dpacked, m_dspare);
+        Profiler profiler2("D_IPP::forwardPolar [d] conv");
+        ippsCartToPolar_64f(m_dpacked, m_dspare, magOut, phaseOut, m_size/2+1);
+    }
+
+    void forwardMagnitude(const double *R__ realIn, double *R__ magOut) {
+        Profiler profiler("D_IPP::forwardMagnitude [d]");
+        if (!m_dspec) initDouble();
+        ippsFFTFwd_RToCCS_64f(realIn, m_dpacked, m_dspec, m_dbuf);
+        unpackDouble(m_dpacked, m_dspare);
+        ippsMagnitude_64f(m_dpacked, m_dspare, magOut, m_size/2+1);
+    }
+
+    void forward(const float *R__ realIn, float *R__ realOut, float *R__ imagOut) {
+        Profiler profiler("D_IPP::forward [f]");
+        if (!m_fspec) initFloat();
+        ippsFFTFwd_RToCCS_32f(realIn, m_fpacked, m_fspec, m_fbuf);
+        unpackFloat(realOut, imagOut);
+    }
+
+    void forwardInterleaved(const float *R__ realIn, float *R__ complexOut) {
+        Profiler profiler("D_IPP::forwardInterleaved [f]");
+        if (!m_fspec) initFloat();
+        ippsFFTFwd_RToCCS_32f(realIn, complexOut, m_fspec, m_fbuf);
+    }
+
+    void forwardPolar(const float *R__ realIn, float *R__ magOut, float *R__ phaseOut) {
+        Profiler profiler("D_IPP::forwardPolar [f]");
+        if (!m_fspec) initFloat();
+        ippsFFTFwd_RToCCS_32f(realIn, m_fpacked, m_fspec, m_fbuf);
+        unpackFloat(m_fpacked, m_fspare);
+        Profiler profiler2("D_IPP::forwardPolar [f] conv");
+        ippsCartToPolar_32f(m_fpacked, m_fspare, magOut, phaseOut, m_size/2+1);
+    }
+
+    void forwardMagnitude(const float *R__ realIn, float *R__ magOut) {
+        Profiler profiler("D_IPP::forwardMagnitude [f]");
+        if (!m_fspec) initFloat();
+        ippsFFTFwd_RToCCS_32f(realIn, m_fpacked, m_fspec, m_fbuf);
+        unpackFloat(m_fpacked, m_fspare);
+        ippsMagnitude_32f(m_fpacked, m_fspare, magOut, m_size/2+1);
+    }
+
+    void inverse(const double *R__ realIn, const double *R__ imagIn, double *R__ realOut) {
+        Profiler profiler("D_IPP::inverse [d]");
+        if (!m_dspec) initDouble();
+        packDouble(realIn, imagIn);
+        ippsFFTInv_CCSToR_64f(m_dpacked, realOut, m_dspec, m_dbuf);
+    }
+
+    void inverseInterleaved(const double *R__ complexIn, double *R__ realOut) {
+        Profiler profiler("D_IPP::inverseInterleaved [d]");
+        if (!m_dspec) initDouble();
+        ippsFFTInv_CCSToR_64f(complexIn, realOut, m_dspec, m_dbuf);
+    }
+
+    void inversePolar(const double *R__ magIn, const double *R__ phaseIn, double *R__ realOut) {
+        Profiler profiler("D_IPP::inversePolar [d]");
+        if (!m_dspec) initDouble();
+        ippsPolarToCart_64f(magIn, phaseIn, realOut, m_dspare, m_size/2+1);
+        Profiler profiler2("D_IPP::inversePolar [d] postconv");
+        packDouble(realOut, m_dspare); // to m_dpacked
+        ippsFFTInv_CCSToR_64f(m_dpacked, realOut, m_dspec, m_dbuf);
+    }
+
+    void inverseCepstral(const double *R__ magIn, double *R__ cepOut) {
+        Profiler profiler("D_IPP::inverseCepstral [d]");
+        if (!m_dspec) initDouble();
+        const int hs1 = m_size/2 + 1;
+        ippsCopy_64f(magIn, m_dspare, hs1);
+        ippsAddC_64f_I(0.000001, m_dspare, hs1);
+        ippsLn_64f_I(m_dspare, hs1);
+        packDouble(m_dspare, 0);
+        ippsFFTInv_CCSToR_64f(m_dpacked, cepOut, m_dspec, m_dbuf);
+    }
+    
+    void inverse(const float *R__ realIn, const float *R__ imagIn, float *R__ realOut) {
+        Profiler profiler("D_IPP::inverse [f]");
+        if (!m_fspec) initFloat();
+        packFloat(realIn, imagIn);
+        ippsFFTInv_CCSToR_32f(m_fpacked, realOut, m_fspec, m_fbuf);
+    }
+
+    void inverseInterleaved(const float *R__ complexIn, float *R__ realOut) {
+        Profiler profiler("D_IPP::inverseInterleaved [f]");
+        if (!m_fspec) initFloat();
+        ippsFFTInv_CCSToR_32f(complexIn, realOut, m_fspec, m_fbuf);
+    }
+
+    void inversePolar(const float *R__ magIn, const float *R__ phaseIn, float *R__ realOut) {
+        Profiler profiler("D_IPP::inversePolar [f]");
+        if (!m_fspec) initFloat();
+        ippsPolarToCart_32f(magIn, phaseIn, realOut, m_fspare, m_size/2+1);
+        Profiler profiler2("D_IPP::inversePolar [f] postconv");
+        packFloat(realOut, m_fspare); // to m_fpacked
+        ippsFFTInv_CCSToR_32f(m_fpacked, realOut, m_fspec, m_fbuf);
+    }
+
+    void inverseCepstral(const float *R__ magIn, float *R__ cepOut) {
+        Profiler profiler("D_IPP::inverseCepstral [f]");
+        if (!m_fspec) initFloat();
+        const int hs1 = m_size/2 + 1;
+        ippsCopy_32f(magIn, m_fspare, hs1);
+        ippsAddC_32f_I(0.000001f, m_fspare, hs1);
+        ippsLn_32f_I(m_fspare, hs1);
+        packFloat(m_fspare, 0);
+        ippsFFTInv_CCSToR_32f(m_fpacked, cepOut, m_fspec, m_fbuf);
+    }
+
+private:
+    const int m_size;
+    int m_order;
+    IppsFFTSpec_R_32f *m_fspec;
+    IppsFFTSpec_R_64f *m_dspec;
+    Ipp8u *m_fbuf;
+    Ipp8u *m_dbuf;
+    float *m_fpacked;
+    float *m_fspare;
+    double *m_dpacked;
+    double *m_dspare;
+};
+
+#endif /* HAVE_IPP */
+
+#ifdef HAVE_VDSP
+
+class D_VDSP : public FFTImpl
+{
+public:
+    D_VDSP(int size) :
+        m_size(size), m_fspec(0), m_dspec(0),
+        m_fpacked(0), m_fspare(0),
+        m_dpacked(0), m_dspare(0)
+    { 
+        for (int i = 0; ; ++i) {
+            if (m_size & (1 << i)) {
+                m_order = i;
+                break;
+            }
+        }
+    }
+
+    ~D_VDSP() {
+        if (m_fspec) {
+            vDSP_destroy_fftsetup(m_fspec);
+            deallocate(m_fspare);
+            deallocate(m_fspare2);
+            deallocate(m_fbuf->realp);
+            deallocate(m_fbuf->imagp);
+            delete m_fbuf;
+            deallocate(m_fpacked->realp);
+            deallocate(m_fpacked->imagp);
+            delete m_fpacked;
+        }
+        if (m_dspec) {
+            vDSP_destroy_fftsetupD(m_dspec);
+            deallocate(m_dspare);
+            deallocate(m_dspare2);
+            deallocate(m_dbuf->realp);
+            deallocate(m_dbuf->imagp);
+            delete m_dbuf;
+            deallocate(m_dpacked->realp);
+            deallocate(m_dpacked->imagp);
+            delete m_dpacked;
+        }
+    }
+
+    FFT::Precisions
+    getSupportedPrecisions() const {
+        return FFT::SinglePrecision | FFT::DoublePrecision;
+    }
+
+    //!!! rv check
+
+    void initFloat() {
+        if (m_fspec) return;
+        m_fspec = vDSP_create_fftsetup(m_order, FFT_RADIX2);
+        m_fbuf = new DSPSplitComplex;
+        //!!! "If possible, tempBuffer->realp and tempBuffer->imagp should be 32-byte aligned for best performance."
+        m_fbuf->realp = allocate<float>(m_size);
+        m_fbuf->imagp = allocate<float>(m_size);
+        m_fpacked = new DSPSplitComplex;
+        m_fpacked->realp = allocate<float>(m_size / 2 + 1);
+        m_fpacked->imagp = allocate<float>(m_size / 2 + 1);
+        m_fspare = allocate<float>(m_size + 2);
+        m_fspare2 = allocate<float>(m_size + 2);
+    }
+
+    void initDouble() {
+        if (m_dspec) return;
+        m_dspec = vDSP_create_fftsetupD(m_order, FFT_RADIX2);
+        m_dbuf = new DSPDoubleSplitComplex;
+        //!!! "If possible, tempBuffer->realp and tempBuffer->imagp should be 32-byte aligned for best performance."
+        m_dbuf->realp = allocate<double>(m_size);
+        m_dbuf->imagp = allocate<double>(m_size);
+        m_dpacked = new DSPDoubleSplitComplex;
+        m_dpacked->realp = allocate<double>(m_size / 2 + 1);
+        m_dpacked->imagp = allocate<double>(m_size / 2 + 1);
+        m_dspare = allocate<double>(m_size + 2);
+        m_dspare2 = allocate<double>(m_size + 2);
+    }
+
+    void packReal(const float *R__ const re) {
+        // Pack input for forward transform 
+        vDSP_ctoz((DSPComplex *)re, 2, m_fpacked, 1, m_size/2);
+    }
+    void packComplex(const float *R__ const re, const float *R__ const im) {
+        // Pack input for inverse transform 
+        if (re) v_copy(m_fpacked->realp, re, m_size/2 + 1);
+        else v_zero(m_fpacked->realp, m_size/2 + 1);
+        if (im) v_copy(m_fpacked->imagp, im, m_size/2 + 1);
+        else v_zero(m_fpacked->imagp, m_size/2 + 1);
+        fnyq();
+    }
+
+    void unpackReal(float *R__ const re) {
+        // Unpack output for inverse transform
+        vDSP_ztoc(m_fpacked, 1, (DSPComplex *)re, 2, m_size/2);
+    }
+    void unpackComplex(float *R__ const re, float *R__ const im) {
+        // Unpack output for forward transform
+        // vDSP forward FFTs are scaled 2x (for some reason)
+        float two = 2.f;
+        vDSP_vsdiv(m_fpacked->realp, 1, &two, re, 1, m_size/2 + 1);
+        vDSP_vsdiv(m_fpacked->imagp, 1, &two, im, 1, m_size/2 + 1);
+    }
+    void unpackComplex(float *R__ const cplx) {
+        // Unpack output for forward transform
+        // vDSP forward FFTs are scaled 2x (for some reason)
+        const int hs1 = m_size/2 + 1;
+        for (int i = 0; i < hs1; ++i) {
+            cplx[i*2] = m_fpacked->realp[i] / 2.f;
+            cplx[i*2+1] = m_fpacked->imagp[i] / 2.f;
+        }
+    }
+
+    void packReal(const double *R__ const re) {
+        // Pack input for forward transform
+        vDSP_ctozD((DSPDoubleComplex *)re, 2, m_dpacked, 1, m_size/2);
+    }
+    void packComplex(const double *R__ const re, const double *R__ const im) {
+        // Pack input for inverse transform
+        if (re) v_copy(m_dpacked->realp, re, m_size/2 + 1);
+        else v_zero(m_dpacked->realp, m_size/2 + 1);
+        if (im) v_copy(m_dpacked->imagp, im, m_size/2 + 1);
+        else v_zero(m_dpacked->imagp, m_size/2 + 1);
+        dnyq();
+    }
+
+    void unpackReal(double *R__ const re) {
+        // Unpack output for inverse transform
+        vDSP_ztocD(m_dpacked, 1, (DSPDoubleComplex *)re, 2, m_size/2);
+    }
+    void unpackComplex(double *R__ const re, double *R__ const im) {
+        // Unpack output for forward transform
+        // vDSP forward FFTs are scaled 2x (for some reason)
+        double two = 2.0;
+        vDSP_vsdivD(m_dpacked->realp, 1, &two, re, 1, m_size/2 + 1);
+        vDSP_vsdivD(m_dpacked->imagp, 1, &two, im, 1, m_size/2 + 1);
+    }
+    void unpackComplex(double *R__ const cplx) {
+        // Unpack output for forward transform
+        // vDSP forward FFTs are scaled 2x (for some reason)
+        const int hs1 = m_size/2 + 1;
+        for (int i = 0; i < hs1; ++i) {
+            cplx[i*2] = m_dpacked->realp[i] / 2.0;
+            cplx[i*2+1] = m_dpacked->imagp[i] / 2.0;
+        }
+    }
+
+    void fdenyq() {
+        // for fft result in packed form, unpack the DC and Nyquist bins
+        const int hs = m_size/2;
+        m_fpacked->realp[hs] = m_fpacked->imagp[0];
+        m_fpacked->imagp[hs] = 0.f;
+        m_fpacked->imagp[0] = 0.f;
+    }
+    void ddenyq() {
+        // for fft result in packed form, unpack the DC and Nyquist bins
+        const int hs = m_size/2;
+        m_dpacked->realp[hs] = m_dpacked->imagp[0];
+        m_dpacked->imagp[hs] = 0.;
+        m_dpacked->imagp[0] = 0.;
+    }
+
+    void fnyq() {
+        // for ifft input in packed form, pack the DC and Nyquist bins
+        const int hs = m_size/2;
+        m_fpacked->imagp[0] = m_fpacked->realp[hs];
+        m_fpacked->realp[hs] = 0.f;
+        m_fpacked->imagp[hs] = 0.f;
+    }
+    void dnyq() {
+        // for ifft input in packed form, pack the DC and Nyquist bins
+        const int hs = m_size/2;
+        m_dpacked->imagp[0] = m_dpacked->realp[hs];
+        m_dpacked->realp[hs] = 0.;
+        m_dpacked->imagp[hs] = 0.;
+    }
+
+    void forward(const double *R__ realIn, double *R__ realOut, double *R__ imagOut) {
+        Profiler profiler("D_VDSP::forward [d]");
+        if (!m_dspec) initDouble();
+        packReal(realIn);
+        vDSP_fft_zriptD(m_dspec, m_dpacked, 1, m_dbuf, m_order, FFT_FORWARD);
+        ddenyq();
+        unpackComplex(realOut, imagOut);
+    }
+
+    void forwardInterleaved(const double *R__ realIn, double *R__ complexOut) {
+        Profiler profiler("D_VDSP::forwardInterleaved [d]");
+        if (!m_dspec) initDouble();
+        packReal(realIn);
+        vDSP_fft_zriptD(m_dspec, m_dpacked, 1, m_dbuf, m_order, FFT_FORWARD);
+        ddenyq();
+        unpackComplex(complexOut);
+    }
+
+    void forwardPolar(const double *R__ realIn, double *R__ magOut, double *R__ phaseOut) {
+        Profiler profiler("D_VDSP::forwardPolar [d]");
+        if (!m_dspec) initDouble();
+        const int hs1 = m_size/2+1;
+        packReal(realIn);
+        vDSP_fft_zriptD(m_dspec, m_dpacked, 1, m_dbuf, m_order, FFT_FORWARD);
+        ddenyq();
+        // vDSP forward FFTs are scaled 2x (for some reason)
+        for (int i = 0; i < hs1; ++i) m_dpacked->realp[i] /= 2.0;
+        for (int i = 0; i < hs1; ++i) m_dpacked->imagp[i] /= 2.0;
+        v_cartesian_to_polar(magOut, phaseOut,
+                             m_dpacked->realp, m_dpacked->imagp, hs1);
+    }
+
+    void forwardMagnitude(const double *R__ realIn, double *R__ magOut) {
+        Profiler profiler("D_VDSP::forwardMagnitude [d]");
+        if (!m_dspec) initDouble();
+        packReal(realIn);
+        vDSP_fft_zriptD(m_dspec, m_dpacked, 1, m_dbuf, m_order, FFT_FORWARD);
+        ddenyq();
+        const int hs1 = m_size/2+1;
+        vDSP_zvmagsD(m_dpacked, 1, m_dspare, 1, hs1);
+        vvsqrt(m_dspare2, m_dspare, &hs1);
+        // vDSP forward FFTs are scaled 2x (for some reason)
+        double two = 2.0;
+        vDSP_vsdivD(m_dspare2, 1, &two, magOut, 1, hs1);
+    }
+
+    void forward(const float *R__ realIn, float *R__ realOut, float *R__ imagOut) {
+        Profiler profiler("D_VDSP::forward [f]");
+        if (!m_fspec) initFloat();
+        packReal(realIn);
+        vDSP_fft_zript(m_fspec, m_fpacked, 1, m_fbuf, m_order, FFT_FORWARD);
+        fdenyq();
+        unpackComplex(realOut, imagOut);
+    }
+
+    void forwardInterleaved(const float *R__ realIn, float *R__ complexOut) {
+        Profiler profiler("D_VDSP::forwardInterleaved [f]");
+        if (!m_fspec) initFloat();
+        packReal(realIn);
+        vDSP_fft_zript(m_fspec, m_fpacked, 1, m_fbuf, m_order, FFT_FORWARD);
+        fdenyq();
+        unpackComplex(complexOut);
+    }
+
+    void forwardPolar(const float *R__ realIn, float *R__ magOut, float *R__ phaseOut) {
+        Profiler profiler("D_VDSP::forwardPolar [f]");
+        if (!m_fspec) initFloat();
+        const int hs1 = m_size/2+1;
+        packReal(realIn);
+        vDSP_fft_zript(m_fspec, m_fpacked, 1, m_fbuf, m_order, FFT_FORWARD);
+        fdenyq();
+        // vDSP real forward FFTs return twice the standard DFT values, so halve them
+        for (int i = 0; i < hs1; ++i) m_fpacked->realp[i] /= 2.f;
+        for (int i = 0; i < hs1; ++i) m_fpacked->imagp[i] /= 2.f;
+        v_cartesian_to_polar(magOut, phaseOut,
+                             m_fpacked->realp, m_fpacked->imagp, hs1);
+    }
+
+    void forwardMagnitude(const float *R__ realIn, float *R__ magOut) {
+        Profiler profiler("D_VDSP::forwardMagnitude [f]");
+        if (!m_fspec) initFloat();
+        packReal(realIn);
+        vDSP_fft_zript(m_fspec, m_fpacked, 1, m_fbuf, m_order, FFT_FORWARD);
+        fdenyq();
+        const int hs1 = m_size/2 + 1;
+        vDSP_zvmags(m_fpacked, 1, m_fspare, 1, hs1);
+        vvsqrtf(m_fspare2, m_fspare, &hs1);
+        // vDSP real forward FFTs return twice the standard DFT values, so halve them
+        float two = 2.f;
+        vDSP_vsdiv(m_fspare2, 1, &two, magOut, 1, hs1);
+    }
+
+    void inverse(const double *R__ realIn, const double *R__ imagIn, double *R__ realOut) {
+        Profiler profiler("D_VDSP::inverse [d]");
+        if (!m_dspec) initDouble();
+        packComplex(realIn, imagIn);
+        vDSP_fft_zriptD(m_dspec, m_dpacked, 1, m_dbuf, m_order, FFT_INVERSE);
+        unpackReal(realOut);
+    }
+
+    void inverseInterleaved(const double *R__ complexIn, double *R__ realOut) {
+        Profiler profiler("D_VDSP::inverseInterleaved [d]");
+        if (!m_dspec) initDouble();
+        double *d[2] = { m_dpacked->realp, m_dpacked->imagp };
+        v_deinterleave(d, complexIn, 2, m_size/2 + 1);
+        vDSP_fft_zriptD(m_dspec, m_dpacked, 1, m_dbuf, m_order, FFT_INVERSE);
+        unpackReal(realOut);
+    }
+
+    void inversePolar(const double *R__ magIn, const double *R__ phaseIn, double *R__ realOut) {
+        Profiler profiler("D_VDSP::inversePolar [d]");
+        if (!m_dspec) initDouble();
+        const int hs1 = m_size/2+1;
+        vvsincos(m_dpacked->imagp, m_dpacked->realp, phaseIn, &hs1);
+        double *const rp = m_dpacked->realp;
+        double *const ip = m_dpacked->imagp;
+        for (int i = 0; i < hs1; ++i) rp[i] *= magIn[i];
+        for (int i = 0; i < hs1; ++i) ip[i] *= magIn[i];
+        dnyq();
+        vDSP_fft_zriptD(m_dspec, m_dpacked, 1, m_dbuf, m_order, FFT_INVERSE);
+        unpackReal(realOut);
+    }
+
+    void inverseCepstral(const double *R__ magIn, double *R__ cepOut) {
+        Profiler profiler("D_VDSP::inverseCepstral [d]");
+        if (!m_dspec) initDouble();
+        const int hs1 = m_size/2 + 1;
+        v_copy(m_dspare, magIn, hs1);
+        for (int i = 0; i < hs1; ++i) m_dspare[i] += 0.000001;
+        vvlog(m_dspare2, m_dspare, &hs1);
+        inverse(m_dspare2, 0, cepOut);
+    }
+    
+    void inverse(const float *R__ realIn, const float *R__ imagIn, float *R__ realOut) {
+        Profiler profiler("D_VDSP::inverse [f]");
+        if (!m_fspec) initFloat();
+        packComplex(realIn, imagIn);
+        vDSP_fft_zript(m_fspec, m_fpacked, 1, m_fbuf, m_order, FFT_INVERSE);
+        unpackReal(realOut);
+    }
+
+    void inverseInterleaved(const float *R__ complexIn, float *R__ realOut) {
+        Profiler profiler("D_VDSP::inverseInterleaved [f]");
+        if (!m_fspec) initFloat();
+        float *f[2] = { m_fpacked->realp, m_fpacked->imagp };
+        v_deinterleave(f, complexIn, 2, m_size/2 + 1);
+        vDSP_fft_zript(m_fspec, m_fpacked, 1, m_fbuf, m_order, FFT_INVERSE);
+        unpackReal(realOut);
+    }
+
+    void inversePolar(const float *R__ magIn, const float *R__ phaseIn, float *R__ realOut) {
+        Profiler profiler("D_VDSP::inversePolar [f]");
+        if (!m_fspec) initFloat();
+
+        const int hs1 = m_size/2+1;
+        vvsincosf(m_fpacked->imagp, m_fpacked->realp, phaseIn, &hs1);
+        float *const rp = m_fpacked->realp;
+        float *const ip = m_fpacked->imagp;
+        for (int i = 0; i < hs1; ++i) rp[i] *= magIn[i];
+        for (int i = 0; i < hs1; ++i) ip[i] *= magIn[i];
+        fnyq();
+        vDSP_fft_zript(m_fspec, m_fpacked, 1, m_fbuf, m_order, FFT_INVERSE);
+        unpackReal(realOut);
+    }
+
+    void inverseCepstral(const float *R__ magIn, float *R__ cepOut) {
+        Profiler profiler("D_VDSP::inverseCepstral [f]");
+        if (!m_fspec) initFloat();
+        const int hs1 = m_size/2 + 1;
+        v_copy(m_fspare, magIn, hs1);
+        for (int i = 0; i < hs1; ++i) m_fspare[i] += 0.000001f;
+        vvlogf(m_fspare2, m_fspare, &hs1);
+        inverse(m_fspare2, 0, cepOut);
+    }
+
+private:
+    const int m_size;
+    int m_order;
+    FFTSetup m_fspec;
+    FFTSetupD m_dspec;
+    DSPSplitComplex *m_fbuf;
+    DSPDoubleSplitComplex *m_dbuf;
+    DSPSplitComplex *m_fpacked;
+    float *m_fspare;
+    float *m_fspare2;
+    DSPDoubleSplitComplex *m_dpacked;
+    double *m_dspare;
+    double *m_dspare2;
+};
+
+#endif /* HAVE_VDSP */
+
+#ifdef HAVE_MEDIALIB
+
+class D_MEDIALIB : public FFTImpl
+{
+public:
+    D_MEDIALIB(int size) :
+        m_size(size),
+        m_dpacked(0), m_fpacked(0)
+    { 
+        for (int i = 0; ; ++i) {
+            if (m_size & (1 << i)) {
+                m_order = i;
+                break;
+            }
+        }
+    }
+
+    ~D_MEDIALIB() {
+        if (m_dpacked) {
+            deallocate(m_dpacked);
+        }
+        if (m_fpacked) {
+            deallocate(m_fpacked);
+        }
+    }
+
+    FFT::Precisions
+    getSupportedPrecisions() const {
+        return FFT::SinglePrecision | FFT::DoublePrecision;
+    }
+
+    //!!! rv check
+
+    void initFloat() {
+        m_fpacked = allocate<float>(m_size*2);
+    }
+
+    void initDouble() {
+        m_dpacked = allocate<double>(m_size*2);
+    }
+
+    void packFloatConjugates() {
+        const int hs = m_size / 2;
+        for (int i = 1; i <= hs; ++i) {
+            m_fpacked[(m_size-i)*2] = m_fpacked[2*i];
+            m_fpacked[(m_size-i)*2 + 1] = -m_fpacked[2*i + 1];
+        }
+    }
+
+    void packDoubleConjugates() {
+        const int hs = m_size / 2;
+        for (int i = 1; i <= hs; ++i) {
+            m_dpacked[(m_size-i)*2] = m_dpacked[2*i];
+            m_dpacked[(m_size-i)*2 + 1] = -m_dpacked[2*i + 1];
+        }
+    }
+
+    void packFloat(const float *R__ re, const float *R__ im) {
+        int index = 0;
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            m_fpacked[index++] = re[i];
+            index++;
+        }
+        index = 0;
+        if (im) {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                m_fpacked[index++] = im[i];
+            }
+        } else {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                m_fpacked[index++] = 0.f;
+            }
+        }
+        packFloatConjugates();
+    }
+
+    void packDouble(const double *R__ re, const double *R__ im) {
+        int index = 0;
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            m_dpacked[index++] = re[i];
+            index++;
+        }
+        index = 0;
+        if (im) {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                m_dpacked[index++] = im[i];
+            }
+        } else {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                m_dpacked[index++] = 0.0;
+            }
+        }
+        packDoubleConjugates();
+    }
+
+    void unpackFloat(float *re, float *R__ im) { // re may be equal to m_fpacked
+        int index = 0;
+        const int hs = m_size/2;
+        if (im) {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                im[i] = m_fpacked[index++];
+            }
+        }
+        index = 0;
+        for (int i = 0; i <= hs; ++i) {
+            re[i] = m_fpacked[index++];
+            index++;
+        }
+    }        
+
+    void unpackDouble(double *re, double *R__ im) { // re may be equal to m_dpacked
+        int index = 0;
+        const int hs = m_size/2;
+        if (im) {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                im[i] = m_dpacked[index++];
+            }
+        }
+        index = 0;
+        for (int i = 0; i <= hs; ++i) {
+            re[i] = m_dpacked[index++];
+            index++;
+        }
+    }
+
+    void forward(const double *R__ realIn, double *R__ realOut, double *R__ imagOut) {
+        Profiler profiler("D_MEDIALIB::forward [d]");
+        if (!m_dpacked) initDouble();
+        mlib_SignalFFT_1_D64C_D64(m_dpacked, realIn, m_order);
+        unpackDouble(realOut, imagOut);
+    }
+
+    void forwardInterleaved(const double *R__ realIn, double *R__ complexOut) {
+        Profiler profiler("D_MEDIALIB::forwardInterleaved [d]");
+        if (!m_dpacked) initDouble();
+        // mlib FFT gives the whole redundant complex result
+        mlib_SignalFFT_1_D64C_D64(m_dpacked, realIn, m_order);
+        v_copy(complexOut, m_dpacked, m_size + 2);
+    }
+
+    void forwardPolar(const double *R__ realIn, double *R__ magOut, double *R__ phaseOut) {
+        Profiler profiler("D_MEDIALIB::forwardPolar [d]");
+        if (!m_dpacked) initDouble();
+        mlib_SignalFFT_1_D64C_D64(m_dpacked, realIn, m_order);
+        const int hs = m_size/2;
+        int index = 0;
+        for (int i = 0; i <= hs; ++i) {
+            int reali = index;
+            ++index;
+            magOut[i] = sqrt(m_dpacked[reali] * m_dpacked[reali] +
+                             m_dpacked[index] * m_dpacked[index]);
+            phaseOut[i] = atan2(m_dpacked[index], m_dpacked[reali]) ;
+            ++index;
+        }
+    }
+
+    void forwardMagnitude(const double *R__ realIn, double *R__ magOut) {
+        Profiler profiler("D_MEDIALIB::forwardMagnitude [d]");
+        if (!m_dpacked) initDouble();
+        mlib_SignalFFT_1_D64C_D64(m_dpacked, realIn, m_order);
+        const int hs = m_size/2;
+        int index = 0;
+        for (int i = 0; i <= hs; ++i) {
+            int reali = index;
+            ++index;
+            magOut[i] = sqrt(m_dpacked[reali] * m_dpacked[reali] +
+                             m_dpacked[index] * m_dpacked[index]);
+            ++index;
+        }
+    }
+
+    void forward(const float *R__ realIn, float *R__ realOut, float *R__ imagOut) {
+        Profiler profiler("D_MEDIALIB::forward [f]");
+        if (!m_fpacked) initFloat();
+        mlib_SignalFFT_1_F32C_F32(m_fpacked, realIn, m_order);
+        unpackFloat(realOut, imagOut);
+    }
+
+    void forwardInterleaved(const float *R__ realIn, float *R__ complexOut) {
+        Profiler profiler("D_MEDIALIB::forwardInterleaved [f]");
+        if (!m_fpacked) initFloat();
+        // mlib FFT gives the whole redundant complex result
+        mlib_SignalFFT_1_F32C_F32(m_fpacked, realIn, m_order);
+        v_copy(complexOut, m_fpacked, m_size + 2);
+    }
+
+    void forwardPolar(const float *R__ realIn, float *R__ magOut, float *R__ phaseOut) {
+        Profiler profiler("D_MEDIALIB::forwardPolar [f]");
+        if (!m_fpacked) initFloat();
+        mlib_SignalFFT_1_F32C_F32(m_fpacked, realIn, m_order);
+        const int hs = m_size/2;
+        int index = 0;
+        for (int i = 0; i <= hs; ++i) {
+            int reali = index;
+            ++index;
+            magOut[i] = sqrtf(m_fpacked[reali] * m_fpacked[reali] +
+                              m_fpacked[index] * m_fpacked[index]);
+            phaseOut[i] = atan2f(m_fpacked[index], m_fpacked[reali]);
+            ++index;
+        }
+    }
+
+    void forwardMagnitude(const float *R__ realIn, float *R__ magOut) {
+        Profiler profiler("D_MEDIALIB::forwardMagnitude [f]");
+        if (!m_fpacked) initFloat();
+        mlib_SignalFFT_1_F32C_F32(m_fpacked, realIn, m_order);
+        const int hs = m_size/2;
+        int index = 0;
+        for (int i = 0; i <= hs; ++i) {
+            int reali = index;
+            ++index;
+            magOut[i] = sqrtf(m_fpacked[reali] * m_fpacked[reali] +
+                              m_fpacked[index] * m_fpacked[index]);
+            ++index;
+        }
+    }
+
+    void inverse(const double *R__ realIn, const double *R__ imagIn, double *R__ realOut) {
+        Profiler profiler("D_MEDIALIB::inverse [d]");
+        if (!m_dpacked) initDouble();
+        packDouble(realIn, imagIn);
+        mlib_SignalIFFT_2_D64_D64C(realOut, m_dpacked, m_order);
+    }
+
+    void inverseInterleaved(const double *R__ complexIn, double *R__ realOut) {
+        Profiler profiler("D_MEDIALIB::inverseInterleaved [d]");
+        if (!m_dpacked) initDouble();
+        v_copy(m_dpacked, complexIn, m_size + 2);
+        packDoubleConjugates();
+        mlib_SignalIFFT_2_D64_D64C(realOut, m_dpacked, m_order);
+    }
+
+    void inversePolar(const double *R__ magIn, const double *R__ phaseIn, double *R__ realOut) {
+        Profiler profiler("D_MEDIALIB::inversePolar [d]");
+        if (!m_dpacked) initDouble();
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            double real = magIn[i] * cos(phaseIn[i]);
+            double imag = magIn[i] * sin(phaseIn[i]);
+            m_dpacked[i*2] = real;
+            m_dpacked[i*2 + 1] = imag;
+        }
+        packDoubleConjugates();
+        mlib_SignalIFFT_2_D64_D64C(realOut, m_dpacked, m_order);
+    }
+
+    void inverseCepstral(const double *R__ magIn, double *R__ cepOut) {
+        Profiler profiler("D_MEDIALIB::inverseCepstral [d]");
+        if (!m_dpacked) initDouble();
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            m_dpacked[i*2] = log(magIn[i] + 0.000001);
+            m_dpacked[i*2 + 1] = 0.0;
+        }
+        packDoubleConjugates();
+        mlib_SignalIFFT_2_D64_D64C(cepOut, m_dpacked, m_order);
+    }
+    
+    void inverse(const float *R__ realIn, const float *R__ imagIn, float *R__ realOut) {
+        Profiler profiler("D_MEDIALIB::inverse [f]");
+        if (!m_fpacked) initFloat();
+        packFloat(realIn, imagIn);
+        mlib_SignalIFFT_2_F32_F32C(realOut, m_fpacked, m_order);
+    }
+    
+    void inverseInterleaved(const float *R__ complexIn, float *R__ realOut) {
+        Profiler profiler("D_MEDIALIB::inverseInterleaved [f]");
+        if (!m_fpacked) initFloat();
+        v_convert(m_fpacked, complexIn, m_size + 2);
+        packFloatConjugates();
+        mlib_SignalIFFT_2_F32_F32C(realOut, m_fpacked, m_order);
+    }
+
+    void inversePolar(const float *R__ magIn, const float *R__ phaseIn, float *R__ realOut) {
+        Profiler profiler("D_MEDIALIB::inversePolar [f]");
+        if (!m_fpacked) initFloat();
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            double real = magIn[i] * cos(phaseIn[i]);
+            double imag = magIn[i] * sin(phaseIn[i]);
+            m_fpacked[i*2] = real;
+            m_fpacked[i*2 + 1] = imag;
+        }
+        packFloatConjugates();
+        mlib_SignalIFFT_2_F32_F32C(realOut, m_fpacked, m_order);
+    }
+
+    void inverseCepstral(const float *R__ magIn, float *R__ cepOut) {
+        Profiler profiler("D_MEDIALIB::inverseCepstral [f]");
+        if (!m_fpacked) initFloat();
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            m_fpacked[i*2] = logf(magIn[i] + 0.000001f);
+            m_fpacked[i*2 + 1] = 0.f;
+        }
+        packFloatConjugates();
+        mlib_SignalIFFT_2_F32_F32C(cepOut, m_fpacked, m_order);
+    }
+
+private:
+    const int m_size;
+    int m_order;
+    double *m_dpacked;
+    float *m_fpacked;
+};
+
+#endif /* HAVE_MEDIALIB */
+
+#ifdef HAVE_OPENMAX
+
+class D_OPENMAX : public FFTImpl
+{
+    // Convert a signed 32-bit integer to a float in the range [-1,1)
+    static inline float i2f(OMX_S32 i)
+    {
+        return float(i) / float(OMX_MAX_S32);
+    }
+
+    // Convert a signed 32-bit integer to a double in the range [-1,1)
+    static inline double i2d(OMX_S32 i)
+    {
+        return double(i) / double(OMX_MAX_S32);
+    }
+
+    // Convert a float in the range [-1,1) to a signed 32-bit integer
+    static inline OMX_S32 f2i(float f)
+    {
+        return OMX_S32(f * OMX_MAX_S32);
+    }
+
+    // Convert a double in the range [-1,1) to a signed 32-bit integer
+    static inline OMX_S32 d2i(double d)
+    {
+        return OMX_S32(d * OMX_MAX_S32);
+    }
+
+public:
+    D_OPENMAX(int size) :
+        m_size(size),
+        m_packed(0)
+    { 
+        for (int i = 0; ; ++i) {
+            if (m_size & (1 << i)) {
+                m_order = i;
+                break;
+            }
+        }
+    }
+
+    ~D_OPENMAX() {
+        if (m_packed) {
+            deallocate(m_packed);
+            deallocate(m_buf);
+            deallocate(m_fbuf);
+            deallocate(m_spec);
+        }
+    }
+
+    FFT::Precisions
+    getSupportedPrecisions() const {
+        return FFT::SinglePrecision;
+    }
+
+    //!!! rv check
+
+    // The OpenMAX implementation uses a fixed-point representation in
+    // 32-bit signed integers, with a downward scaling factor (0-32
+    // bits) supplied as an argument to the FFT function.
+
+    void initFloat() {
+        initDouble();
+    }
+
+    void initDouble() {
+        if (!m_packed) {
+            m_buf = allocate<OMX_S32>(m_size);
+            m_packed = allocate<OMX_S32>(m_size*2 + 2);
+            m_fbuf = allocate<float>(m_size*2 + 2);
+            OMX_INT sz = 0;
+            omxSP_FFTGetBufSize_R_S32(m_order, &sz);
+            m_spec = (OMXFFTSpec_R_S32 *)allocate<char>(sz);
+            omxSP_FFTInit_R_S32(m_spec, m_order);
+        }
+    }
+
+    void packFloat(const float *R__ re) {
+        // prepare fixed point input for forward transform
+        for (int i = 0; i < m_size; ++i) {
+            m_buf[i] = f2i(re[i]);
+        }
+    }
+
+    void packDouble(const double *R__ re) {
+        // prepare fixed point input for forward transform
+        for (int i = 0; i < m_size; ++i) {
+            m_buf[i] = d2i(re[i]);
+        }
+    }
+
+    void unpackFloat(float *R__ re, float *R__ im) {
+        // convert fixed point output for forward transform
+        int index = 0;
+        const int hs = m_size/2;
+        if (im) {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                im[i] = i2f(m_packed[index++]);
+            }
+            v_scale(im, m_size, hs + 1);
+        }
+        index = 0;
+        for (int i = 0; i <= hs; ++i) {
+            re[i] = i2f(m_packed[index++]);
+            index++;
+        }
+        v_scale(re, m_size, hs + 1);
+    }        
+
+    void unpackDouble(double *R__ re, double *R__ im) {
+        // convert fixed point output for forward transform
+        int index = 0;
+        const int hs = m_size/2;
+        if (im) {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                im[i] = i2d(m_packed[index++]);
+            }
+            v_scale(im, m_size, hs + 1);
+        }
+        index = 0;
+        for (int i = 0; i <= hs; ++i) {
+            re[i] = i2d(m_packed[index++]);
+            index++;
+        }
+        v_scale(re, m_size, hs + 1);
+    }
+
+    void unpackFloatInterleaved(float *R__ cplx) {
+        // convert fixed point output for forward transform
+        for (int i = 0; i < m_size + 2; ++i) {
+            cplx[i] = i2f(m_packed[i]);
+        }            
+        v_scale(cplx, m_size, m_size + 2);
+    }
+
+    void unpackDoubleInterleaved(double *R__ cplx) {
+        // convert fixed point output for forward transform
+        for (int i = 0; i < m_size + 2; ++i) {
+            cplx[i] = i2d(m_packed[i]);
+        }            
+        v_scale(cplx, m_size, m_size + 2);
+    }
+
+    void packFloat(const float *R__ re, const float *R__ im) {
+        // prepare fixed point input for inverse transform
+        int index = 0;
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            m_packed[index++] = f2i(re[i]);
+            index++;
+        }
+        index = 0;
+        if (im) {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                m_packed[index++] = f2i(im[i]);
+            }
+        } else {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                m_packed[index++] = 0;
+            }
+        }
+    }
+
+    void packDouble(const double *R__ re, const double *R__ im) {
+        // prepare fixed point input for inverse transform
+        int index = 0;
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            m_packed[index++] = d2i(re[i]);
+            index++;
+        }
+        index = 0;
+        if (im) {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                m_packed[index++] = d2i(im[i]);
+            }
+        } else {
+            for (int i = 0; i <= hs; ++i) {
+                index++;
+                m_packed[index++] = 0;
+            }
+        }
+    }
+
+    void convertFloat(const float *R__ f) {
+        // convert interleaved input for inverse interleaved transform
+        const int n = m_size + 2;
+        for (int i = 0; i < n; ++i) {
+            m_packed[i] = f2i(f[i]);
+        }
+    }        
+
+    void convertDouble(const double *R__ d) {
+        // convert interleaved input for inverse interleaved transform
+        const int n = m_size + 2;
+        for (int i = 0; i < n; ++i) {
+            m_packed[i] = d2i(d[i]);
+        }
+    }        
+
+    void unpackFloat(float *R__ re) {
+        // convert fixed point output for inverse transform
+        for (int i = 0; i < m_size; ++i) {
+            re[i] = i2f(m_buf[i]) * m_size;
+        }
+    }
+
+    void unpackDouble(double *R__ re) {
+        // convert fixed point output for inverse transform
+        for (int i = 0; i < m_size; ++i) {
+            re[i] = i2d(m_buf[i]) * m_size;
+        }
+    }
+
+    void forward(const double *R__ realIn, double *R__ realOut, double *R__ imagOut) {
+        Profiler profiler("D_OPENMAX::forward [d]");
+        if (!m_packed) initDouble();
+        packDouble(realIn);
+        omxSP_FFTFwd_RToCCS_S32_Sfs(m_buf, m_packed, m_spec, m_order);
+        unpackDouble(realOut, imagOut);
+    }
+    
+    void forwardInterleaved(const double *R__ realIn, double *R__ complexOut) {
+        Profiler profiler("D_OPENMAX::forwardInterleaved [d]");
+        if (!m_packed) initDouble();
+        packDouble(realIn);
+        omxSP_FFTFwd_RToCCS_S32_Sfs(m_buf, m_packed, m_spec, m_order);
+        unpackDoubleInterleaved(complexOut);
+    }
+
+    void forwardPolar(const double *R__ realIn, double *R__ magOut, double *R__ phaseOut) {
+        Profiler profiler("D_OPENMAX::forwardPolar [d]");
+        if (!m_packed) initDouble();
+        packDouble(realIn);
+        omxSP_FFTFwd_RToCCS_S32_Sfs(m_buf, m_packed, m_spec, m_order);
+        unpackDouble(magOut, phaseOut); // temporarily
+        // at this point we actually have real/imag in the mag/phase arrays
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            double real = magOut[i];
+            double imag = phaseOut[i];
+            c_magphase(magOut + i, phaseOut + i, real, imag);
+        }
+    }
+
+    void forwardMagnitude(const double *R__ realIn, double *R__ magOut) {
+        Profiler profiler("D_OPENMAX::forwardMagnitude [d]");
+        if (!m_packed) initDouble();
+        packDouble(realIn);
+        omxSP_FFTFwd_RToCCS_S32_Sfs(m_buf, m_packed, m_spec, m_order);
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            int reali = i * 2;
+            int imagi = reali + 1;
+            double real = i2d(m_packed[reali]) * m_size;
+            double imag = i2d(m_packed[imagi]) * m_size;
+            magOut[i] = sqrt(real * real + imag * imag);
+        }
+    }
+
+    void forward(const float *R__ realIn, float *R__ realOut, float *R__ imagOut) {
+        Profiler profiler("D_OPENMAX::forward [f]");
+        if (!m_packed) initFloat();
+        packFloat(realIn);
+        omxSP_FFTFwd_RToCCS_S32_Sfs(m_buf, m_packed, m_spec, m_order);
+        unpackFloat(realOut, imagOut);
+    }
+
+    void forwardInterleaved(const float *R__ realIn, float *R__ complexOut) {
+        Profiler profiler("D_OPENMAX::forwardInterleaved [f]");
+        if (!m_packed) initFloat();
+        packFloat(realIn);
+        omxSP_FFTFwd_RToCCS_S32_Sfs(m_buf, m_packed, m_spec, m_order);
+        unpackFloatInterleaved(complexOut);
+    }
+
+    void forwardPolar(const float *R__ realIn, float *R__ magOut, float *R__ phaseOut) {
+        Profiler profiler("D_OPENMAX::forwardPolar [f]");
+        if (!m_packed) initFloat();
+
+        packFloat(realIn);
+        omxSP_FFTFwd_RToCCS_S32_Sfs(m_buf, m_packed, m_spec, m_order);
+        unpackFloat(magOut, phaseOut); // temporarily
+        // at this point we actually have real/imag in the mag/phase arrays
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            float real = magOut[i];
+            float imag = phaseOut[i];
+            c_magphase(magOut + i, phaseOut + i, real, imag);
+        }
+    }
+
+    void forwardMagnitude(const float *R__ realIn, float *R__ magOut) {
+        Profiler profiler("D_OPENMAX::forwardMagnitude [f]");
+        if (!m_packed) initFloat();
+        packFloat(realIn);
+        omxSP_FFTFwd_RToCCS_S32_Sfs(m_buf, m_packed, m_spec, m_order);
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            int reali = i * 2;
+            int imagi = reali + 1;
+            float real = i2f(m_packed[reali]) * m_size;
+            float imag = i2f(m_packed[imagi]) * m_size;
+            magOut[i] = sqrtf(real * real + imag * imag);
+        }
+    }
+
+    void inverse(const double *R__ realIn, const double *R__ imagIn, double *R__ realOut) {
+        Profiler profiler("D_OPENMAX::inverse [d]");
+        if (!m_packed) initDouble();
+        packDouble(realIn, imagIn);
+        omxSP_FFTInv_CCSToR_S32_Sfs(m_packed, m_buf, m_spec, 0);
+        unpackDouble(realOut);
+    }
+
+    void inverseInterleaved(const double *R__ complexIn, double *R__ realOut) {
+        Profiler profiler("D_OPENMAX::inverseInterleaved [d]");
+        if (!m_packed) initDouble();
+        convertDouble(complexIn);
+        omxSP_FFTInv_CCSToR_S32_Sfs(m_packed, m_buf, m_spec, 0);
+        unpackDouble(realOut);
+    }
+
+    void inversePolar(const double *R__ magIn, const double *R__ phaseIn, double *R__ realOut) {
+        Profiler profiler("D_OPENMAX::inversePolar [d]");
+        if (!m_packed) initDouble();
+        int index = 0;
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            double real, imag;
+            c_phasor(&real, &imag, phaseIn[i]);
+            m_fbuf[index++] = float(magIn[i] * real);
+            m_fbuf[index++] = float(magIn[i] * imag);
+        }
+        convertFloat(m_fbuf);
+        omxSP_FFTInv_CCSToR_S32_Sfs(m_packed, m_buf, m_spec, 0);
+        unpackDouble(realOut);
+    }
+
+    void inverseCepstral(const double *R__ magIn, double *R__ cepOut) {
+        Profiler profiler("D_OPENMAX::inverseCepstral [d]");
+        if (!m_packed) initDouble();
+        //!!! implement
+    }
+    
+    void inverse(const float *R__ realIn, const float *R__ imagIn, float *R__ realOut) {
+        Profiler profiler("D_OPENMAX::inverse [f]");
+        if (!m_packed) initFloat();
+        packFloat(realIn, imagIn);
+        omxSP_FFTInv_CCSToR_S32_Sfs(m_packed, m_buf, m_spec, 0);
+        unpackFloat(realOut);
+    }
+
+    void inverseInterleaved(const float *R__ complexIn, float *R__ realOut) {
+        Profiler profiler("D_OPENMAX::inverseInterleaved [f]");
+        if (!m_packed) initFloat();
+        convertFloat(complexIn);
+        omxSP_FFTInv_CCSToR_S32_Sfs(m_packed, m_buf, m_spec, 0);
+        unpackFloat(realOut);
+    }
+
+    void inversePolar(const float *R__ magIn, const float *R__ phaseIn, float *R__ realOut) {
+        Profiler profiler("D_OPENMAX::inversePolar [f]");
+        if (!m_packed) initFloat();
+        const int hs = m_size/2;
+        v_polar_to_cartesian_interleaved(m_fbuf, magIn, phaseIn, hs+1);
+        convertFloat(m_fbuf);
+        omxSP_FFTInv_CCSToR_S32_Sfs(m_packed, m_buf, m_spec, 0);
+        unpackFloat(realOut);
+    }
+
+    void inverseCepstral(const float *R__ magIn, float *R__ cepOut) {
+        Profiler profiler("D_OPENMAX::inverseCepstral [f]");
+        if (!m_packed) initFloat();
+        //!!! implement
+    }
+
+private:
+    const int m_size;
+    int m_order;
+    OMX_S32 *m_packed;
+    OMX_S32 *m_buf;
+    float *m_fbuf;
+    OMXFFTSpec_R_S32 *m_spec;
+
+};
+
+#endif /* HAVE_OPENMAX */
 
 #ifdef HAVE_FFTW3
 
-// Define FFTW_DOUBLE_ONLY to make all uses of FFTW functions be
-// double-precision (so "float" FFTs are calculated by casting to
-// doubles and using the double-precision FFTW function).
-//
-// Define FFTW_FLOAT_ONLY to make all uses of FFTW functions be
-// single-precision (so "double" FFTs are calculated by casting to
-// floats and using the single-precision FFTW function).
-//
-// Neither of these flags is terribly desirable -- FFTW_FLOAT_ONLY
-// obviously loses you precision, and neither is handled in the most
-// efficient way so any performance improvement will be small at best.
-// The only real reason to define either flag would be to avoid
-// linking against both fftw3 and fftw3f libraries.
+/*
+ Define FFTW_DOUBLE_ONLY to make all uses of FFTW functions be
+ double-precision (so "float" FFTs are calculated by casting to
+ doubles and using the double-precision FFTW function).
+
+ Define FFTW_SINGLE_ONLY to make all uses of FFTW functions be
+ single-precision (so "double" FFTs are calculated by casting to
+ floats and using the single-precision FFTW function).
+
+ Neither of these flags is desirable for either performance or
+ precision. The main reason to define either flag is to avoid linking
+ against both fftw3 and fftw3f libraries.
+*/
 
 //#define FFTW_DOUBLE_ONLY 1
-//#define FFTW_FLOAT_ONLY 1
+//#define FFTW_SINGLE_ONLY 1
 
-#if defined(FFTW_DOUBLE_ONLY) && defined(FFTW_FLOAT_ONLY)
+#if defined(FFTW_DOUBLE_ONLY) && defined(FFTW_SINGLE_ONLY)
 // Can't meaningfully define both
-#undef FFTW_DOUBLE_ONLY
-#undef FFTW_FLOAT_ONLY
+#error Can only define one of FFTW_DOUBLE_ONLY and FFTW_SINGLE_ONLY
+#endif
+
+#if defined(FFTW_FLOAT_ONLY)
+#warning FFTW_FLOAT_ONLY is deprecated, use FFTW_SINGLE_ONLY instead
+#define FFTW_SINGLE_ONLY 1
 #endif
 
 #ifdef FFTW_DOUBLE_ONLY
@@ -129,7 +1537,7 @@ namespace FFTs {
 #define fft_float_type float
 #endif /* FFTW_DOUBLE_ONLY */
 
-#ifdef FFTW_FLOAT_ONLY
+#ifdef FFTW_SINGLE_ONLY
 #define fft_double_type float
 #define fftw_complex fftwf_complex
 #define fftw_plan fftwf_plan
@@ -145,7 +1553,7 @@ namespace FFTs {
 #define sin sinf
 #else
 #define fft_double_type double
-#endif /* FFTW_FLOAT_ONLY */
+#endif /* FFTW_SINGLE_ONLY */
 
 class D_FFTW : public FFTImpl
 {
@@ -157,7 +1565,9 @@ public:
 
     ~D_FFTW() {
         if (m_fplanf) {
+#ifndef NO_THREADING
             m_commonMutex.lock();
+#endif
             bool save = false;
             if (m_extantf > 0 && --m_extantf == 0) save = true;
 #ifndef FFTW_DOUBLE_ONLY
@@ -167,27 +1577,48 @@ public:
             fftwf_destroy_plan(m_fplani);
             fftwf_free(m_fbuf);
             fftwf_free(m_fpacked);
+#ifndef NO_THREADING
             m_commonMutex.unlock();
+#endif
         }
         if (m_dplanf) {
+#ifndef NO_THREADING
             m_commonMutex.lock();
+#endif
             bool save = false;
             if (m_extantd > 0 && --m_extantd == 0) save = true;
-#ifndef FFTW_FLOAT_ONLY
+#ifndef FFTW_SINGLE_ONLY
             if (save) saveWisdom('d');
 #endif
             fftw_destroy_plan(m_dplanf);
             fftw_destroy_plan(m_dplani);
             fftw_free(m_dbuf);
             fftw_free(m_dpacked);
+#ifndef NO_THREADING
             m_commonMutex.unlock();
+#endif
         }
     }
 
+    FFT::Precisions
+    getSupportedPrecisions() const {
+#ifdef FFTW_SINGLE_ONLY
+        return FFT::SinglePrecision;
+#else
+#ifdef FFTW_DOUBLE_ONLY
+        return FFT::DoublePrecision;
+#else
+        return FFT::SinglePrecision | FFT::DoublePrecision;
+#endif
+#endif
+    }
+
     void initFloat() {
         if (m_fplanf) return;
         bool load = false;
+#ifndef NO_THREADING
         m_commonMutex.lock();
+#endif
         if (m_extantf++ == 0) load = true;
 #ifdef FFTW_DOUBLE_ONLY
         if (load) loadWisdom('d');
@@ -201,15 +1632,19 @@ public:
             (m_size, m_fbuf, m_fpacked, FFTW_MEASURE);
         m_fplani = fftwf_plan_dft_c2r_1d
             (m_size, m_fpacked, m_fbuf, FFTW_MEASURE);
+#ifndef NO_THREADING
         m_commonMutex.unlock();
+#endif
     }
 
     void initDouble() {
         if (m_dplanf) return;
         bool load = false;
+#ifndef NO_THREADING
         m_commonMutex.lock();
+#endif
         if (m_extantd++ == 0) load = true;
-#ifdef FFTW_FLOAT_ONLY
+#ifdef FFTW_SINGLE_ONLY
         if (load) loadWisdom('f');
 #else
         if (load) loadWisdom('d');
@@ -221,7 +1656,9 @@ public:
             (m_size, m_dbuf, m_dpacked, FFTW_MEASURE);
         m_dplani = fftw_plan_dft_c2r_1d
             (m_size, m_dpacked, m_dbuf, FFTW_MEASURE);
+#ifndef NO_THREADING
         m_commonMutex.unlock();
+#endif
     }
 
     void loadWisdom(char type) { wisdom(false, type); }
@@ -232,7 +1669,7 @@ public:
 #ifdef FFTW_DOUBLE_ONLY
         if (type == 'f') return;
 #endif
-#ifdef FFTW_FLOAT_ONLY
+#ifdef FFTW_SINGLE_ONLY
         if (type == 'd') return;
 #endif
 
@@ -252,7 +1689,7 @@ public:
 #else
             case 'f': fftwf_export_wisdom_to_file(f); break;
 #endif
-#ifdef FFTW_FLOAT_ONLY
+#ifdef FFTW_SINGLE_ONLY
             case 'd': break;
 #else
             case 'd': fftw_export_wisdom_to_file(f); break;
@@ -266,7 +1703,7 @@ public:
 #else
             case 'f': fftwf_import_wisdom_from_file(f); break;
 #endif
-#ifdef FFTW_FLOAT_ONLY
+#ifdef FFTW_SINGLE_ONLY
             case 'd': break;
 #else
             case 'd': fftw_import_wisdom_from_file(f); break;
@@ -340,7 +1777,7 @@ public:
         if (!m_dplanf) initDouble();
         const int sz = m_size;
         fft_double_type *const R__ dbuf = m_dbuf;
-#ifndef FFTW_FLOAT_ONLY
+#ifndef FFTW_SINGLE_ONLY
         if (realIn != dbuf) 
 #endif
             for (int i = 0; i < sz; ++i) {
@@ -354,7 +1791,7 @@ public:
         if (!m_dplanf) initDouble();
         const int sz = m_size;
         fft_double_type *const R__ dbuf = m_dbuf;
-#ifndef FFTW_FLOAT_ONLY
+#ifndef FFTW_SINGLE_ONLY
         if (realIn != dbuf) 
 #endif
             for (int i = 0; i < sz; ++i) {
@@ -368,7 +1805,7 @@ public:
         if (!m_dplanf) initDouble();
         fft_double_type *const R__ dbuf = m_dbuf;
         const int sz = m_size;
-#ifndef FFTW_FLOAT_ONLY
+#ifndef FFTW_SINGLE_ONLY
         if (realIn != dbuf)
 #endif
             for (int i = 0; i < sz; ++i) {
@@ -383,7 +1820,7 @@ public:
         if (!m_dplanf) initDouble();
         fft_double_type *const R__ dbuf = m_dbuf;
         const int sz = m_size;
-#ifndef FFTW_FLOAT_ONLY
+#ifndef FFTW_SINGLE_ONLY
         if (realIn != m_dbuf)
 #endif
             for (int i = 0; i < sz; ++i) {
@@ -464,7 +1901,7 @@ public:
         fftw_execute(m_dplani);
         const int sz = m_size;
         fft_double_type *const R__ dbuf = m_dbuf;
-#ifndef FFTW_FLOAT_ONLY
+#ifndef FFTW_SINGLE_ONLY
         if (realOut != dbuf) 
 #endif
             for (int i = 0; i < sz; ++i) {
@@ -474,11 +1911,11 @@ public:
 
     void inverseInterleaved(const double *R__ complexIn, double *R__ realOut) {
         if (!m_dplanf) initDouble();
-        v_copy((double *)m_dpacked, complexIn, m_size + 2);
+        v_convert((fft_double_type *)m_dpacked, complexIn, m_size + 2);
         fftw_execute(m_dplani);
         const int sz = m_size;
         fft_double_type *const R__ dbuf = m_dbuf;
-#ifndef FFTW_FLOAT_ONLY
+#ifndef FFTW_SINGLE_ONLY
         if (realOut != dbuf) 
 #endif
             for (int i = 0; i < sz; ++i) {
@@ -499,7 +1936,7 @@ public:
         fftw_execute(m_dplani);
         const int sz = m_size;
         fft_double_type *const R__ dbuf = m_dbuf;
-#ifndef FFTW_FLOAT_ONLY
+#ifndef FFTW_SINGLE_ONLY
         if (realOut != dbuf)
 #endif
             for (int i = 0; i < sz; ++i) {
@@ -520,7 +1957,7 @@ public:
         }
         fftw_execute(m_dplani);
         const int sz = m_size;
-#ifndef FFTW_FLOAT_ONLY
+#ifndef FFTW_SINGLE_ONLY
         if (cepOut != dbuf)
 #endif
             for (int i = 0; i < sz; ++i) {
@@ -609,7 +2046,7 @@ private:
     fftwf_complex *m_fpacked;
     fftw_plan m_dplanf;
     fftw_plan m_dplani;
-#ifdef FFTW_FLOAT_ONLY
+#ifdef FFTW_SINGLE_ONLY
     float *m_dbuf;
 #else
     double *m_dbuf;
@@ -618,7 +2055,9 @@ private:
     const int m_size;
     static int m_extantf;
     static int m_extantd;
+#ifndef NO_THREADING
     static Mutex m_commonMutex;
+#endif
 };
 
 int
@@ -627,11 +2066,356 @@ D_FFTW::m_extantf = 0;
 int
 D_FFTW::m_extantd = 0;
 
+#ifndef NO_THREADING
 Mutex
 D_FFTW::m_commonMutex;
+#endif
 
 #endif /* HAVE_FFTW3 */
 
+#ifdef HAVE_SFFT
+
+/*
+ Define SFFT_DOUBLE_ONLY to make all uses of SFFT functions be
+ double-precision (so "float" FFTs are calculated by casting to
+ doubles and using the double-precision SFFT function).
+
+ Define SFFT_SINGLE_ONLY to make all uses of SFFT functions be
+ single-precision (so "double" FFTs are calculated by casting to
+ floats and using the single-precision SFFT function).
+
+ Neither of these flags is desirable for either performance or
+ precision.
+*/
+
+//#define SFFT_DOUBLE_ONLY 1
+//#define SFFT_SINGLE_ONLY 1
+
+#if defined(SFFT_DOUBLE_ONLY) && defined(SFFT_SINGLE_ONLY)
+// Can't meaningfully define both
+#error Can only define one of SFFT_DOUBLE_ONLY and SFFT_SINGLE_ONLY
+#endif
+
+#ifdef SFFT_DOUBLE_ONLY
+#define fft_float_type double
+#define FLAG_SFFT_FLOAT SFFT_DOUBLE
+#define atan2f atan2
+#define sqrtf sqrt
+#define cosf cos
+#define sinf sin
+#define logf log
+#else
+#define FLAG_SFFT_FLOAT SFFT_FLOAT
+#define fft_float_type float
+#endif /* SFFT_DOUBLE_ONLY */
+
+#ifdef SFFT_SINGLE_ONLY
+#define fft_double_type float
+#define FLAG_SFFT_DOUBLE SFFT_FLOAT
+#define atan2 atan2f
+#define sqrt sqrtf
+#define cos cosf
+#define sin sinf
+#define log logf
+#else
+#define FLAG_SFFT_DOUBLE SFFT_DOUBLE
+#define fft_double_type double
+#endif /* SFFT_SINGLE_ONLY */
+
+class D_SFFT : public FFTImpl
+{
+public:
+    D_SFFT(int size) :
+        m_fplanf(0), m_fplani(0), m_dplanf(0), m_dplani(0), m_size(size)
+    {
+    }
+
+    ~D_SFFT() {
+        if (m_fplanf) {
+            sfft_free(m_fplanf);
+            sfft_free(m_fplani);
+            deallocate(m_fbuf);
+            deallocate(m_fresult);
+        }
+        if (m_dplanf) {
+            sfft_free(m_dplanf);
+            sfft_free(m_dplani);
+            deallocate(m_dbuf);
+            deallocate(m_dresult);
+        }
+    }
+
+    FFT::Precisions
+    getSupportedPrecisions() const {
+#ifdef SFFT_SINGLE_ONLY
+        return FFT::SinglePrecision;
+#else
+#ifdef SFFT_DOUBLE_ONLY
+        return FFT::DoublePrecision;
+#else
+        return FFT::SinglePrecision | FFT::DoublePrecision;
+#endif
+#endif
+    }
+
+    void initFloat() {
+        if (m_fplanf) return;
+        m_fbuf = allocate<fft_float_type>(2 * m_size);
+        m_fresult = allocate<fft_float_type>(2 * m_size);
+        m_fplanf = sfft_init(m_size, SFFT_FORWARD | FLAG_SFFT_FLOAT);
+        m_fplani = sfft_init(m_size, SFFT_BACKWARD | FLAG_SFFT_FLOAT);
+        if (!m_fplanf || !m_fplani) {
+            if (!m_fplanf) {
+                std::cerr << "D_SFFT: Failed to construct forward float transform for size " << m_size << " (check SFFT library's target configuration)" << std::endl;
+            } else {
+                std::cerr << "D_SFFT: Failed to construct inverse float transform for size " << m_size << " (check SFFT library's target configuration)" << std::endl;
+            }
+#ifndef NO_EXCEPTIONS
+            throw FFT::InternalError;
+#else
+            abort();
+#endif
+        }
+    }
+
+    void initDouble() {
+        if (m_dplanf) return;
+        m_dbuf = allocate<fft_double_type>(2 * m_size);
+        m_dresult = allocate<fft_double_type>(2 * m_size);
+        m_dplanf = sfft_init(m_size, SFFT_FORWARD | FLAG_SFFT_DOUBLE);
+        m_dplani = sfft_init(m_size, SFFT_BACKWARD | FLAG_SFFT_DOUBLE);
+        if (!m_dplanf || !m_dplani) {
+            if (!m_dplanf) {
+                std::cerr << "D_SFFT: Failed to construct forward double transform for size " << m_size << " (check SFFT library's target configuration)" << std::endl;
+            } else {
+                std::cerr << "D_SFFT: Failed to construct inverse double transform for size " << m_size << " (check SFFT library's target configuration)" << std::endl;
+            }
+#ifndef NO_EXCEPTIONS
+            throw FFT::InternalError;
+#else
+            abort();
+#endif
+        }
+    }
+
+    void packFloat(const float *R__ re, const float *R__ im, fft_float_type *target, int n) {
+        for (int i = 0; i < n; ++i) target[i*2] = re[i];
+        if (im) {
+            for (int i = 0; i < n; ++i) target[i*2+1] = im[i]; 
+        } else {
+            for (int i = 0; i < n; ++i) target[i*2+1] = 0.f;
+        }                
+    }
+
+    void packDouble(const double *R__ re, const double *R__ im, fft_double_type *target, int n) {
+        for (int i = 0; i < n; ++i) target[i*2] = re[i];
+        if (im) {
+            for (int i = 0; i < n; ++i) target[i*2+1] = im[i];
+        } else {
+            for (int i = 0; i < n; ++i) target[i*2+1] = 0.0;
+        }                
+    }
+
+    void unpackFloat(const fft_float_type *source, float *R__ re, float *R__ im, int n) {
+        for (int i = 0; i < n; ++i) re[i] = source[i*2];
+        if (im) {
+            for (int i = 0; i < n; ++i) im[i] = source[i*2+1];
+        }
+    }        
+
+    void unpackDouble(const fft_double_type *source, double *R__ re, double *R__ im, int n) {
+        for (int i = 0; i < n; ++i) re[i] = source[i*2];
+        if (im) {
+            for (int i = 0; i < n; ++i) im[i] = source[i*2+1];
+        }
+    }        
+
+    template<typename T>
+    void mirror(T *R__ cplx, int n) {
+        for (int i = 1; i <= n/2; ++i) {
+            int j = n-i;
+            cplx[j*2] = cplx[i*2];
+            cplx[j*2+1] = -cplx[i*2+1];
+        }
+    }
+
+    void forward(const double *R__ realIn, double *R__ realOut, double *R__ imagOut) {
+        if (!m_dplanf) initDouble();
+        packDouble(realIn, 0, m_dbuf, m_size);
+        sfft_execute(m_dplanf, m_dbuf, m_dresult);
+        unpackDouble(m_dresult, realOut, imagOut, m_size/2+1);
+    }
+
+    void forwardInterleaved(const double *R__ realIn, double *R__ complexOut) {
+        if (!m_dplanf) initDouble();
+        packDouble(realIn, 0, m_dbuf, m_size);
+        sfft_execute(m_dplanf, m_dbuf, m_dresult);
+        v_convert(complexOut, m_dresult, m_size+2); // i.e. m_size/2+1 complex
+    }
+
+    void forwardPolar(const double *R__ realIn, double *R__ magOut, double *R__ phaseOut) {
+        if (!m_dplanf) initDouble();
+        packDouble(realIn, 0, m_dbuf, m_size);
+        sfft_execute(m_dplanf, m_dbuf, m_dresult);
+        v_cartesian_interleaved_to_polar(magOut, phaseOut,
+                                         m_dresult, m_size/2+1);
+    }
+
+    void forwardMagnitude(const double *R__ realIn, double *R__ magOut) {
+        if (!m_dplanf) initDouble();
+        packDouble(realIn, 0, m_dbuf, m_size);
+        sfft_execute(m_dplanf, m_dbuf, m_dresult);
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            magOut[i] = sqrt(m_dresult[i*2] * m_dresult[i*2] +
+                             m_dresult[i*2+1] * m_dresult[i*2+1]);
+        }
+    }
+
+    void forward(const float *R__ realIn, float *R__ realOut, float *R__ imagOut) {
+        if (!m_fplanf) initFloat();
+        packFloat(realIn, 0, m_fbuf, m_size);
+        sfft_execute(m_fplanf, m_fbuf, m_fresult);
+        unpackFloat(m_fresult, realOut, imagOut, m_size/2+1);
+    }
+
+    void forwardInterleaved(const float *R__ realIn, float *R__ complexOut) {
+        if (!m_fplanf) initFloat();
+        packFloat(realIn, 0, m_fbuf, m_size);
+        sfft_execute(m_fplanf, m_fbuf, m_fresult);
+        v_convert(complexOut, m_fresult, m_size+2); // i.e. m_size/2+1 complex
+    }
+
+    void forwardPolar(const float *R__ realIn, float *R__ magOut, float *R__ phaseOut) {
+        if (!m_fplanf) initFloat();
+        packFloat(realIn, 0, m_fbuf, m_size);
+        sfft_execute(m_fplanf, m_fbuf, m_fresult);
+        v_cartesian_interleaved_to_polar(magOut, phaseOut,
+                                         m_fresult, m_size/2+1);
+    }
+
+    void forwardMagnitude(const float *R__ realIn, float *R__ magOut) {
+        if (!m_fplanf) initFloat();
+        packFloat(realIn, 0, m_fbuf, m_size);
+        sfft_execute(m_fplanf, m_fbuf, m_fresult);
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            magOut[i] = sqrtf(m_fresult[i*2] * m_fresult[i*2] +
+                              m_fresult[i*2+1] * m_fresult[i*2+1]);
+        }
+    }
+
+    void inverse(const double *R__ realIn, const double *R__ imagIn, double *R__ realOut) {
+        if (!m_dplanf) initDouble();
+        packDouble(realIn, imagIn, m_dbuf, m_size/2+1);
+        mirror(m_dbuf, m_size);
+        sfft_execute(m_dplani, m_dbuf, m_dresult);
+        for (int i = 0; i < m_size; ++i) {
+            realOut[i] = m_dresult[i*2];
+        }
+    }
+
+    void inverseInterleaved(const double *R__ complexIn, double *R__ realOut) {
+        if (!m_dplanf) initDouble();
+        v_convert(m_dbuf, complexIn, m_size + 2);
+        mirror(m_dbuf, m_size);
+        sfft_execute(m_dplani, m_dbuf, m_dresult);
+        for (int i = 0; i < m_size; ++i) {
+            realOut[i] = m_dresult[i*2];
+        }
+    }
+
+    void inversePolar(const double *R__ magIn, const double *R__ phaseIn, double *R__ realOut) {
+        if (!m_dplanf) initDouble();
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            m_dbuf[i*2] = magIn[i] * cos(phaseIn[i]);
+            m_dbuf[i*2+1] = magIn[i] * sin(phaseIn[i]);
+        }
+        mirror(m_dbuf, m_size);
+        sfft_execute(m_dplani, m_dbuf, m_dresult);
+        for (int i = 0; i < m_size; ++i) {
+            realOut[i] = m_dresult[i*2];
+        }
+    }
+
+    void inverseCepstral(const double *R__ magIn, double *R__ cepOut) {
+        if (!m_dplanf) initDouble();
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            m_dbuf[i*2] = log(magIn[i] + 0.000001);
+            m_dbuf[i*2+1] = 0.0;
+        }
+        mirror(m_dbuf, m_size);
+        sfft_execute(m_dplani, m_dbuf, m_dresult);
+        for (int i = 0; i < m_size; ++i) {
+            cepOut[i] = m_dresult[i*2];
+        }
+    }
+
+    void inverse(const float *R__ realIn, const float *R__ imagIn, float *R__ realOut) {
+        if (!m_fplanf) initFloat();
+        packFloat(realIn, imagIn, m_fbuf, m_size/2+1);
+        mirror(m_fbuf, m_size);
+        sfft_execute(m_fplani, m_fbuf, m_fresult);
+        for (int i = 0; i < m_size; ++i) {
+            realOut[i] = m_fresult[i*2];
+        }
+    }
+
+    void inverseInterleaved(const float *R__ complexIn, float *R__ realOut) {
+        if (!m_fplanf) initFloat();
+        v_convert(m_fbuf, complexIn, m_size + 2);
+        mirror(m_fbuf, m_size);
+        sfft_execute(m_fplani, m_fbuf, m_fresult);
+        for (int i = 0; i < m_size; ++i) {
+            realOut[i] = m_fresult[i*2];
+        }
+    }
+
+    void inversePolar(const float *R__ magIn, const float *R__ phaseIn, float *R__ realOut) {
+        if (!m_fplanf) initFloat();
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            m_fbuf[i*2] = magIn[i] * cosf(phaseIn[i]);
+            m_fbuf[i*2+1] = magIn[i] * sinf(phaseIn[i]);
+        }
+        mirror(m_fbuf, m_size);
+        sfft_execute(m_fplani, m_fbuf, m_fresult);
+        for (int i = 0; i < m_size; ++i) {
+            realOut[i] = m_fresult[i*2];
+        }
+    }
+
+    void inverseCepstral(const float *R__ magIn, float *R__ cepOut) {
+        if (!m_fplanf) initFloat();
+        const int hs = m_size/2;
+        for (int i = 0; i <= hs; ++i) {
+            m_fbuf[i*2] = logf(magIn[i] + 0.00001);
+            m_fbuf[i*2+1] = 0.0f;
+        }
+        mirror(m_fbuf, m_size);
+        sfft_execute(m_fplani, m_fbuf, m_fresult);
+        for (int i = 0; i < m_size; ++i) {
+            cepOut[i] = m_fresult[i*2];
+        }
+    }
+
+private:
+    sfft_plan_t *m_fplanf;
+    sfft_plan_t *m_fplani;
+    fft_float_type *m_fbuf;
+    fft_float_type *m_fresult;
+
+    sfft_plan_t *m_dplanf;
+    sfft_plan_t *m_dplani;
+    fft_double_type *m_dbuf;
+    fft_double_type *m_dresult;
+
+    const int m_size;
+};
+
+#endif /* HAVE_SFFT */
+
 #ifdef USE_KISSFFT
 
 class D_KISSFFT : public FFTImpl
@@ -665,6 +2449,11 @@ public:
         delete[] m_fpacked;
     }
 
+    FFT::Precisions
+    getSupportedPrecisions() const {
+        return FFT::SinglePrecision;
+    }
+
     void initFloat() { }
     void initDouble() { }
 
@@ -958,6 +2747,11 @@ public:
         delete[] m_d;
     }
 
+    FFT::Precisions
+    getSupportedPrecisions() const {
+        return FFT::DoublePrecision;
+    }
+
     void initFloat() { }
     void initDouble() { }
 
@@ -1257,114 +3051,130 @@ D_Cross::basefft(bool inverse, const double *R__ ri, const double *R__ ii, doubl
 
 } /* end namespace FFTs */
 
-int
-FFT::m_method = -1;
+std::string
+FFT::m_implementation;
 
-FFT::FFT(int size, int debugLevel)
+std::set<std::string>
+FFT::getImplementations()
+{
+    std::set<std::string> impls;
+#ifdef HAVE_IPP
+    impls.insert("ipp");
+#endif
+#ifdef HAVE_FFTW3
+    impls.insert("fftw");
+#endif
+#ifdef USE_KISSFFT
+    impls.insert("kissfft");
+#endif
+#ifdef HAVE_VDSP
+    impls.insert("vdsp");
+#endif
+#ifdef HAVE_MEDIALIB
+    impls.insert("medialib");
+#endif
+#ifdef HAVE_OPENMAX
+    impls.insert("openmax");
+#endif
+#ifdef HAVE_SFFT
+    impls.insert("sfft");
+#endif
+#ifdef USE_BUILTIN_FFT
+    impls.insert("cross");
+#endif
+    return impls;
+}
+
+void
+FFT::pickDefaultImplementation()
+{
+    if (m_implementation != "") return;
+
+    std::set<std::string> impls = getImplementations();
+
+    std::string best = "cross";
+    if (impls.find("kissfft") != impls.end()) best = "kissfft";
+    if (impls.find("medialib") != impls.end()) best = "medialib";
+    if (impls.find("openmax") != impls.end()) best = "openmax";
+    if (impls.find("sfft") != impls.end()) best = "sfft";
+    if (impls.find("fftw") != impls.end()) best = "fftw";
+    if (impls.find("vdsp") != impls.end()) best = "vdsp";
+    if (impls.find("ipp") != impls.end()) best = "ipp";
+    
+    m_implementation = best;
+}
+
+std::string
+FFT::getDefaultImplementation()
+{
+    return m_implementation;
+}
+
+void
+FFT::setDefaultImplementation(std::string i)
+{
+    m_implementation = i;
+}
+
+FFT::FFT(int size, int debugLevel) :
+    d(0)
 {
     if ((size < 2) ||
         (size & (size-1))) {
         std::cerr << "FFT::FFT(" << size << "): power-of-two sizes only supported, minimum size 2" << std::endl;
+#ifndef NO_EXCEPTIONS
         throw InvalidSize;
-    }
-
-    if (m_method == -1) {
-        m_method = 3;
-#ifdef USE_KISSFFT
-        m_method = 2;
-#endif
-#ifdef HAVE_FFTW3
-        m_method = 1;
-#endif
-    }
-
-    switch (m_method) {
-
-    case 0:
-        std::cerr << "FFT::FFT(" << size << "): WARNING: Selected implemention not available" << std::endl;
-#ifdef USE_BUILTIN_FFT
-        d = new FFTs::D_Cross(size);
 #else
-        std::cerr << "FFT::FFT(" << size << "): ERROR: Fallback implementation not available!" << std::endl;
         abort();
 #endif
-        break;
+    }
 
-    case 1:
+    if (m_implementation == "") pickDefaultImplementation();
+    std::string impl = m_implementation;
+
+    if (debugLevel > 0) {
+        std::cerr << "FFT::FFT(" << size << "): using implementation: "
+                  << impl << std::endl;
+    }
+
+    if (impl == "ipp") {
+#ifdef HAVE_IPP
+        d = new FFTs::D_IPP(size);
+#endif
+    } else if (impl == "fftw") {
 #ifdef HAVE_FFTW3
-        if (debugLevel > 0) {
-            std::cerr << "FFT::FFT(" << size << "): using FFTW3 implementation"
-                      << std::endl;
-        }
         d = new FFTs::D_FFTW(size);
-#else
-        std::cerr << "FFT::FFT(" << size << "): WARNING: Selected implemention not available" << std::endl;
-#ifdef USE_BUILTIN_FFT
-        d = new FFTs::D_Cross(size);
-#else
-        std::cerr << "FFT::FFT(" << size << "): ERROR: Fallback implementation not available!" << std::endl;
-        abort();
 #endif
-#endif
-        break;
-
-    case 2:
+    } else if (impl == "kissfft") {        
 #ifdef USE_KISSFFT
-        if (debugLevel > 0) {
-            std::cerr << "FFT::FFT(" << size << "): using KISSFFT implementation"
-                      << std::endl;
-        }
         d = new FFTs::D_KISSFFT(size);
-#else
-        std::cerr << "FFT::FFT(" << size << "): WARNING: Selected implemention not available" << std::endl;
+#endif
+    } else if (impl == "vdsp") {
+#ifdef HAVE_VDSP
+        d = new FFTs::D_VDSP(size);
+#endif
+    } else if (impl == "medialib") {
+#ifdef HAVE_MEDIALIB
+        d = new FFTs::D_MEDIALIB(size);
+#endif
+    } else if (impl == "openmax") {
+#ifdef HAVE_OPENMAX
+        d = new FFTs::D_OPENMAX(size);
+#endif
+    } else if (impl == "sfft") {
+#ifdef HAVE_SFFT
+        d = new FFTs::D_SFFT(size);
+#endif
+    } else if (impl == "cross") {
 #ifdef USE_BUILTIN_FFT
         d = new FFTs::D_Cross(size);
-#else
-        std::cerr << "FFT::FFT(" << size << "): ERROR: Fallback implementation not available!" << std::endl;
-        abort();
 #endif
-#endif
-        break;
+    }
 
-    case 4:
-        std::cerr << "FFT::FFT(" << size << "): WARNING: Selected implemention not available" << std::endl;
-#ifdef USE_BUILTIN_FFT
-        d = new FFTs::D_Cross(size);
-#else
-        std::cerr << "FFT::FFT(" << size << "): ERROR: Fallback implementation not available!" << std::endl;
-        abort();
-#endif
-        break;
-
-    case 5:
-        std::cerr << "FFT::FFT(" << size << "): WARNING: Selected implemention not available" << std::endl;
-#ifdef USE_BUILTIN_FFT
-        d = new FFTs::D_Cross(size);
-#else
-        std::cerr << "FFT::FFT(" << size << "): ERROR: Fallback implementation not available!" << std::endl;
-        abort();
-#endif
-        break;
-
-    case 6:
-        std::cerr << "FFT::FFT(" << size << "): WARNING: Selected implemention not available" << std::endl;
-#ifdef USE_BUILTIN_FFT
-        d = new FFTs::D_Cross(size);
-#else
-        std::cerr << "FFT::FFT(" << size << "): ERROR: Fallback implementation not available!" << std::endl;
-        abort();
-#endif
-        break;
-
-    default:
-#ifdef USE_BUILTIN_FFT
-        std::cerr << "FFT::FFT(" << size << "): WARNING: using slow built-in implementation" << std::endl;
-        d = new FFTs::D_Cross(size);
-#else
-        std::cerr << "FFT::FFT(" << size << "): ERROR: Fallback implementation not available!" << std::endl;
-        abort();
-#endif
-        break;
+    if (!d) {
+        std::cerr << "FFT::FFT(" << size << "): ERROR: implementation "
+                  << impl << " is not compiled in" << std::endl;
+        throw InvalidImplementation;
     }
 }
 
@@ -1481,12 +3291,288 @@ FFT::initDouble()
     d->initDouble();
 }
 
+FFT::Precisions
+FFT::getSupportedPrecisions() const
+{
+    return d->getSupportedPrecisions();
+}
 
-void
+#ifdef FFT_MEASUREMENT
+
+std::string
 FFT::tune()
 {
-    std::cerr << "FFT::tune: Measurement not enabled" << std::endl;
+    std::ostringstream os;
+    os << "FFT::tune()..." << std::endl;
+
+    std::vector<int> sizes;
+    std::map<FFTImpl *, int> candidates;
+    std::map<int, int> wins;
+
+    sizes.push_back(512);
+    sizes.push_back(1024);
+    sizes.push_back(4096);
+    
+    for (unsigned int si = 0; si < sizes.size(); ++si) {
+
+        int size = sizes[si];
+
+        while (!candidates.empty()) {
+            delete candidates.begin()->first;
+            candidates.erase(candidates.begin());
+        }
+
+        FFTImpl *d;
+        
+#ifdef HAVE_IPP
+        os << "Constructing new IPP FFT object for size " << size << "..." << std::endl;
+        d = new FFTs::D_IPP(size);
+        d->initFloat();
+        d->initDouble();
+        candidates[d] = 0;
+#endif
+        
+#ifdef HAVE_FFTW3
+        os << "Constructing new FFTW3 FFT object for size " << size << "..." << std::endl;
+        d = new FFTs::D_FFTW(size);
+        d->initFloat();
+        d->initDouble();
+        candidates[d] = 1;
+#endif
+
+#ifdef USE_KISSFFT
+        os << "Constructing new KISSFFT object for size " << size << "..." << std::endl;
+        d = new FFTs::D_KISSFFT(size);
+        d->initFloat();
+        d->initDouble();
+        candidates[d] = 2;
+#endif        
+
+#ifdef USE_BUILTIN_FFT
+        os << "Constructing new Cross FFT object for size " << size << "..." << std::endl;
+        d = new FFTs::D_Cross(size);
+        d->initFloat();
+        d->initDouble();
+        candidates[d] = 3;
+#endif
+        
+#ifdef HAVE_VDSP
+        os << "Constructing new vDSP FFT object for size " << size << "..." << std::endl;
+        d = new FFTs::D_VDSP(size);
+        d->initFloat();
+        d->initDouble();
+        candidates[d] = 4;
+#endif
+        
+#ifdef HAVE_MEDIALIB
+        os << "Constructing new MediaLib FFT object for size " << size << "..." << std::endl;
+        d = new FFTs::D_MEDIALIB(size);
+        d->initFloat();
+        d->initDouble();
+        candidates[d] = 5;
+#endif
+        
+#ifdef HAVE_OPENMAX
+        os << "Constructing new OpenMAX FFT object for size " << size << "..." << std::endl;
+        d = new FFTs::D_OPENMAX(size);
+        d->initFloat();
+        d->initDouble();
+        candidates[d] = 6;
+#endif
+        
+#ifdef HAVE_SFFT
+        os << "Constructing new SFFT FFT object for size " << size << "..." << std::endl;
+        d = new FFTs::D_SFFT(size);
+//        d->initFloat();
+        d->initDouble();
+        candidates[d] = 7;
+#endif
+
+        os << "CLOCKS_PER_SEC = " << CLOCKS_PER_SEC << std::endl;
+        float divisor = float(CLOCKS_PER_SEC) / 1000.f;
+        
+        os << "Timing order is: ";
+        for (std::map<FFTImpl *, int>::iterator ci = candidates.begin();
+             ci != candidates.end(); ++ci) {
+            os << ci->second << " ";
+        }
+        os << std::endl;
+
+        int iterations = 500;
+        os << "Iterations: " << iterations << std::endl;
+
+        double *da = new double[size];
+        double *db = new double[size];
+        double *dc = new double[size];
+        double *dd = new double[size];
+        double *di = new double[size + 2];
+        double *dj = new double[size + 2];
+
+        float *fa = new float[size];
+        float *fb = new float[size];
+        float *fc = new float[size];
+        float *fd = new float[size];
+        float *fi = new float[size + 2];
+        float *fj = new float[size + 2];
+
+        for (int type = 0; type < 16; ++type) {
+    
+            //!!!
+            if ((type > 3 && type < 8) ||
+                (type > 11)) {
+                continue;
+            }
+
+            if (type > 7) {
+                // inverse transform: bigger inputs, to simulate the
+                // fact that the forward transform is unscaled
+                for (int i = 0; i < size; ++i) {
+                    da[i] = drand48() * size;
+                    fa[i] = da[i];
+                    db[i] = drand48() * size;
+                    fb[i] = db[i];
+                }
+            } else {    
+                for (int i = 0; i < size; ++i) {
+                    da[i] = drand48();
+                    fa[i] = da[i];
+                    db[i] = drand48();
+                    fb[i] = db[i];
+                }
+            }
+                
+            for (int i = 0; i < size + 2; ++i) {
+                di[i] = drand48();
+                fi[i] = di[i];
+            }
+
+            int low = -1;
+            int lowscore = 0;
+
+            const char *names[] = {
+
+                "Forward Cartesian Double",
+                "Forward Interleaved Double",
+                "Forward Polar Double",
+                "Forward Magnitude Double",
+                "Forward Cartesian Float",
+                "Forward Interleaved Float",
+                "Forward Polar Float",
+                "Forward Magnitude Float",
+
+                "Inverse Cartesian Double",
+                "Inverse Interleaved Double",
+                "Inverse Polar Double",
+                "Inverse Cepstral Double",
+                "Inverse Cartesian Float",
+                "Inverse Interleaved Float",
+                "Inverse Polar Float",
+                "Inverse Cepstral Float"
+            };
+            os << names[type] << " :: ";
+
+            for (std::map<FFTImpl *, int>::iterator ci = candidates.begin();
+                 ci != candidates.end(); ++ci) {
+
+                FFTImpl *d = ci->first;
+
+                double mean = 0;
+
+                clock_t start = clock();
+                
+                for (int i = 0; i < iterations; ++i) {
+
+                    if (i == 0) {
+                        for (int j = 0; j < size; ++j) {
+                            dc[j] = 0;
+                            dd[j] = 0;
+                            fc[j] = 0;
+                            fd[j] = 0;
+                            fj[j] = 0;
+                            dj[j] = 0;
+                        }
+                    }
+
+                    switch (type) {
+                    case 0: d->forward(da, dc, dd); break;
+                    case 1: d->forwardInterleaved(da, dj); break;
+                    case 2: d->forwardPolar(da, dc, dd); break;
+                    case 3: d->forwardMagnitude(da, dc); break;
+                    case 4: d->forward(fa, fc, fd); break;
+                    case 5: d->forwardInterleaved(fa, fj); break;
+                    case 6: d->forwardPolar(fa, fc, fd); break;
+                    case 7: d->forwardMagnitude(fa, fc); break;
+                    case 8: d->inverse(da, db, dc); break;
+                    case 9: d->inverseInterleaved(di, dc); break;
+                    case 10: d->inversePolar(da, db, dc); break;
+                    case 11: d->inverseCepstral(da, dc); break;
+                    case 12: d->inverse(fa, fb, fc); break;
+                    case 13: d->inverseInterleaved(fi, fc); break;
+                    case 14: d->inversePolar(fa, fb, fc); break;
+                    case 15: d->inverseCepstral(fa, fc); break;
+                    }
+
+                    if (i == 0) {
+                        mean = 0;
+                        for (int j = 0; j < size; ++j) {
+                            mean += dc[j];
+                            mean += dd[j];
+                            mean += fc[j];
+                            mean += fd[j];
+                            mean += fj[j];
+                            mean += dj[j];
+                        }
+                        mean /= size * 6;
+                    }
+                }
+
+                clock_t end = clock();
+
+                os << float(end - start)/divisor << " (" << mean << ") ";
+
+                if (low == -1 || (end - start) < lowscore) {
+                    low = ci->second;
+                    lowscore = end - start;
+                }
+            }
+
+            os << std::endl;
+
+            os << "  size " << size << ", type " << type << ": fastest is " << low << " (time " << float(lowscore)/divisor << ")" << std::endl;
+
+            wins[low]++;
+        }
+        
+        delete[] fa;
+        delete[] fb;
+        delete[] fc;
+        delete[] fd;
+        delete[] da;
+        delete[] db;
+        delete[] dc;
+        delete[] dd;
+    }
+
+    while (!candidates.empty()) {
+        delete candidates.begin()->first;
+        candidates.erase(candidates.begin());
+    }
+
+    int bestscore = 0;
+    int best = -1;
+
+    for (std::map<int, int>::iterator wi = wins.begin(); wi != wins.end(); ++wi) {
+        if (best == -1 || wi->second > bestscore) {
+            best = wi->first;
+            bestscore = wi->second;
+        }
+    }
+
+    os << "overall winner is " << best << " with " << bestscore << " wins" << std::endl;
+
+    return os.str();
 }
 
+#endif
 
 }
diff --git a/src/dsp/FFT.h b/src/dsp/FFT.h
index ebd6453..b3b4480 100644
--- a/src/dsp/FFT.h
+++ b/src/dsp/FFT.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_FFT_H_
@@ -17,6 +26,9 @@
 
 #include "system/sysutils.h"
 
+#include <string>
+#include <set>
+
 namespace RubberBand {
 
 class FFTImpl;
@@ -41,7 +53,7 @@ class FFTImpl;
 class FFT
 {
 public:
-    enum Exception { InvalidSize };
+    enum Exception { InvalidSize, InvalidImplementation, InternalError };
 
     FFT(int size, int debugLevel = 0); // may throw InvalidSize
     ~FFT();
@@ -73,11 +85,36 @@ public:
     void initFloat();
     void initDouble();
 
-    static void tune();
+    enum Precision {
+        SinglePrecision = 0x1,
+        DoublePrecision = 0x2
+    };
+    typedef int Precisions;
+
+    /**
+     * Return the OR of all precisions supported by this
+     * implementation. All of the functions (float and double) are
+     * available regardless of the supported precisions, but they
+     * will be calculated at the proper precision only if it is
+     * available. (So float functions will be calculated using doubles
+     * and then truncated if single-precision is unavailable, and
+     * double functions will use single-precision arithmetic if double
+     * is unavailable.)
+     */
+    Precisions getSupportedPrecisions() const;
+
+    static std::set<std::string> getImplementations();
+    static std::string getDefaultImplementation();
+    static void setDefaultImplementation(std::string);
+
+#ifdef FFT_MEASUREMENT
+    static std::string tune();
+#endif
 
 protected:
     FFTImpl *d;
-    static int m_method;
+    static std::string m_implementation;
+    static void pickDefaultImplementation();
 };
 
 }
diff --git a/src/dsp/MovingMedian.h b/src/dsp/MovingMedian.h
index 9943479..c867fe2 100644
--- a/src/dsp/MovingMedian.h
+++ b/src/dsp/MovingMedian.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _MOVING_MEDIAN_H_
diff --git a/src/dsp/Resampler.cpp b/src/dsp/Resampler.cpp
index b0232b7..649abbc 100644
--- a/src/dsp/Resampler.cpp
+++ b/src/dsp/Resampler.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*- vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "Resampler.h"
@@ -22,6 +31,11 @@
 
 #include "system/Allocators.h"
 
+#ifdef HAVE_IPP
+#include <ipps.h>
+#include <ippsr.h>
+#include <ippac.h>
+#endif
 
 #ifdef HAVE_LIBSAMPLERATE
 #include <samplerate.h>
@@ -31,12 +45,19 @@
 #include <libresample.h>
 #endif
 
+#ifdef USE_SPEEX
+#include "speex/speex_resampler.h"
+#endif
 
+#ifndef HAVE_IPP
 #ifndef HAVE_LIBSAMPLERATE
 #ifndef HAVE_LIBRESAMPLE
+#ifndef USE_SPEEX
 #error No resampler implementation selected!
 #endif
 #endif
+#endif
+#endif
 
 namespace RubberBand {
 
@@ -64,6 +85,360 @@ public:
 
 namespace Resamplers {
 
+#ifdef HAVE_IPP
+
+class D_IPP : public ResamplerImpl
+{
+public:
+    D_IPP(Resampler::Quality quality, int channels, int maxBufferSize,
+          int debugLevel);
+    ~D_IPP();
+
+    int resample(const float *const R__ *const R__ in,
+                 float *const R__ *const R__ out,
+                 int incount,
+                 float ratio,
+                 bool final);
+
+    int resampleInterleaved(const float *const R__ in,
+                            float *const R__ out,
+                            int incount,
+                            float ratio,
+                            bool final = false);
+
+    int getChannelCount() const { return m_channels; }
+
+    void reset();
+
+protected:
+    IppsResamplingPolyphase_32f **m_state;
+    float **m_inbuf;
+    size_t m_inbufsz;
+    float **m_outbuf;
+    size_t m_outbufsz;
+    int m_bufsize;
+    int m_channels;
+    int m_window;
+    float m_factor;
+    int m_history;
+    int *m_lastread;
+    double *m_time;
+    int m_debugLevel;
+    
+    void setBufSize(int);
+};
+
+D_IPP::D_IPP(Resampler::Quality quality, int channels, int maxBufferSize,
+             int debugLevel) :
+    m_state(0),
+    m_channels(channels),
+    m_debugLevel(debugLevel)
+{
+    if (m_debugLevel > 0) {
+        std::cerr << "Resampler::Resampler: using IPP implementation"
+                  << std::endl;
+    }
+
+    int nStep;
+    IppHintAlgorithm hint;
+
+    switch (quality) {
+
+    case Resampler::Best:
+        m_window = 64;
+        nStep = 80;
+        hint = ippAlgHintAccurate;
+        break;
+
+    case Resampler::FastestTolerable:
+//        m_window = 48;
+        nStep = 16;
+        m_window = 16;
+//        nStep = 8;
+        hint = ippAlgHintFast;
+        break;
+
+    case Resampler::Fastest:
+        m_window = 24;
+        nStep = 64;
+        hint = ippAlgHintFast;
+        break;
+    }
+
+    m_factor = 8; // initial upper bound on ratio, may be amended later
+    m_history = int(m_window * 0.5 * std::max(1.0, 1.0 / m_factor)) + 1;
+
+    m_state = new IppsResamplingPolyphase_32f *[m_channels];
+
+    m_lastread = new int[m_channels];
+    m_time = new double[m_channels];
+
+    m_bufsize = maxBufferSize + m_history;
+
+    if (m_debugLevel > 1) {
+        std::cerr << "bufsize = " << m_bufsize << ", window = " << m_window << ", nStep = " << nStep << ", history = " << m_history << std::endl;
+    }
+
+    for (int c = 0; c < m_channels; ++c) {
+        ippsResamplePolyphaseInitAlloc_32f(&m_state[c],
+                                           float(m_window),
+                                           nStep,
+                                           0.95f,
+                                           9.0f,
+                                           hint);
+        m_lastread[c] = m_history;
+        m_time[c] = m_history;
+    }
+
+    m_inbufsz = m_bufsize + m_history + 2;
+    if (m_debugLevel > 1) {
+        std::cerr << "inbuf allocating " << m_bufsize << " + " << m_history << " + 2 = " << m_inbufsz << std::endl;
+    }
+
+    m_outbufsz = lrintf(ceil((m_bufsize - m_history) * m_factor + 2));
+    if (m_debugLevel > 1) {
+        std::cerr << "outbuf allocating (" << m_bufsize << " - " << m_history << ") * " << m_factor << " + 2 = " << m_outbufsz << std::endl;
+    }
+
+    m_inbuf  = allocate_and_zero_channels<float>(m_channels, m_inbufsz);
+    m_outbuf = allocate_and_zero_channels<float>(m_channels, m_outbufsz);
+
+    if (m_debugLevel > 1) {
+        std::cerr << "Resampler init done" << std::endl;
+    }
+}
+
+D_IPP::~D_IPP()
+{
+    for (int c = 0; c < m_channels; ++c) {
+        ippsResamplePolyphaseFree_32f(m_state[c]);
+    }
+
+    deallocate_channels(m_inbuf, m_channels);
+    deallocate_channels(m_outbuf, m_channels);
+
+    delete[] m_lastread;
+    delete[] m_time;
+    delete[] m_state;
+}
+
+void
+D_IPP::setBufSize(int sz)
+{
+    if (m_debugLevel > 1) {
+        std::cerr << "resize bufsize " << m_bufsize << " -> ";
+    }
+
+    m_bufsize = sz;
+
+    if (m_debugLevel > 1) std::cerr << m_bufsize << std::endl;
+
+    int n1 = m_bufsize + m_history + 2;
+    int n2 = lrintf(ceil((m_bufsize - m_history) * m_factor + 2));
+
+    if (m_debugLevel > 1) {
+        std::cerr << "(outbufsize = " << n2 << ")" << std::endl;
+    }
+
+    m_inbuf = reallocate_and_zero_extend_channels
+        (m_inbuf, m_channels, m_inbufsz, m_channels, n1);
+
+    m_outbuf = reallocate_and_zero_extend_channels
+        (m_outbuf, m_channels, m_outbufsz, m_channels, n2);
+            
+    m_inbufsz = n1;
+    m_outbufsz = n2;
+}
+
+int
+D_IPP::resample(const float *const R__ *const R__ in,
+                float *const R__ *const R__ out,
+                int incount,
+                float ratio,
+                bool final)
+{
+    int outcount = 0;
+
+    if (ratio > m_factor) {
+        m_factor = ratio;
+        m_history = int(m_window * 0.5 * std::max(1.0, 1.0 / m_factor)) + 1;
+    }
+
+    for (int c = 0; c < m_channels; ++c) {
+        if (m_lastread[c] + incount + m_history > m_bufsize) {
+            setBufSize(m_lastread[c] + incount + m_history);
+        }
+    }
+
+    for (int c = 0; c < m_channels; ++c) {
+
+        for (int i = 0; i < incount; ++i) {
+            m_inbuf[c][m_lastread[c] + i] = in[c][i];
+        }
+        m_lastread[c] += incount;
+        
+        ippsResamplePolyphase_32f(m_state[c],
+                                  m_inbuf[c],
+                                  m_lastread[c] - m_history - int(m_time[c]),
+                                  m_outbuf[c],
+                                  ratio,
+                                  0.97f,
+                                  &m_time[c],
+                                  &outcount);
+
+        v_copy(out[c], m_outbuf[c], outcount);
+
+        ippsMove_32f(m_inbuf[c] + int(m_time[c]) - m_history,
+                     m_inbuf[c],
+                     m_lastread[c] + m_history - int(m_time[c]));
+
+        m_lastread[c] -= int(m_time[c]) - m_history;
+        m_time[c] -= int(m_time[c]) - m_history;
+
+        if (final) {
+
+            // Looks like this actually produces too many samples
+            // (additionalcount is a few samples too large).
+
+            // Also, we aren't likely to have enough space in the
+            // output buffer as the caller won't have allowed for
+            // all the samples we're retrieving here.
+
+            // What to do?
+
+            int additionalcount = 0;
+
+            for (int i = 0; i < m_history; ++i) {
+                m_inbuf[c][m_lastread[c] + i] = 0.f;
+            }
+            
+            ippsResamplePolyphase_32f(m_state[c],
+                                      m_inbuf[c],
+                                      m_lastread[c] - int(m_time[c]),
+                                      m_outbuf[c],
+                                      ratio,
+                                      0.97f,
+                                      &m_time[c],
+                                      &additionalcount);
+
+            if (m_debugLevel > 2) {
+                std::cerr << "incount = " << incount << ", outcount = " << outcount << ", additionalcount = " << additionalcount << ", sum " << outcount + additionalcount << ", est space = " << lrintf(ceil(incount * ratio)) <<std::endl;
+            }
+
+            v_copy(out[c] + outcount, m_outbuf[c], additionalcount);
+
+            outcount += additionalcount;
+        }
+    }
+
+    for (int c = 0; c < m_channels; ++c) {
+        ippsThreshold_32f_I(out[c], outcount, 1.f, ippCmpGreater);
+        ippsThreshold_32f_I(out[c], outcount, -1.f, ippCmpLess);
+    }
+
+    return outcount;
+}
+
+int
+D_IPP::resampleInterleaved(const float *const R__ in,
+                           float *const R__ out,
+                           int incount,
+                           float ratio,
+                           bool final)
+{
+    int outcount = 0;
+
+    if (ratio > m_factor) {
+        m_factor = ratio;
+        m_history = int(m_window * 0.5 * std::max(1.0, 1.0 / m_factor)) + 1;
+    }
+
+    for (int c = 0; c < m_channels; ++c) {
+        if (m_lastread[c] + incount + m_history > m_bufsize) {
+            setBufSize(m_lastread[c] + incount + m_history);
+        }
+    }
+
+    for (int c = 0; c < m_channels; ++c) {
+
+        for (int i = 0; i < incount; ++i) {
+            m_inbuf[c][m_lastread[c] + i] = in[i * m_channels + c];
+        }
+        m_lastread[c] += incount;
+        
+        ippsResamplePolyphase_32f(m_state[c],
+                                  m_inbuf[c],
+                                  m_lastread[c] - m_history - int(m_time[c]),
+                                  m_outbuf[c],
+                                  ratio,
+                                  0.97f,
+                                  &m_time[c],
+                                  &outcount);
+
+        ippsMove_32f(m_inbuf[c] + int(m_time[c]) - m_history,
+                     m_inbuf[c],
+                     m_lastread[c] + m_history - int(m_time[c]));
+
+        m_lastread[c] -= int(m_time[c]) - m_history;
+        m_time[c] -= int(m_time[c]) - m_history;
+    }
+
+    v_interleave(out, m_outbuf, m_channels, outcount);
+
+    if (final) {
+
+        // Looks like this actually produces too many samples
+        // (additionalcount is a few samples too large).
+
+        // Also, we aren't likely to have enough space in the
+        // output buffer as the caller won't have allowed for
+        // all the samples we're retrieving here.
+
+        // What to do?
+
+        int additionalcount = 0;
+        
+        for (int c = 0; c < m_channels; ++c) {
+
+            for (int i = 0; i < m_history; ++i) {
+                m_inbuf[c][m_lastread[c] + i] = 0.f;
+            }
+            
+            ippsResamplePolyphase_32f(m_state[c],
+                                      m_inbuf[c],
+                                      m_lastread[c] - int(m_time[c]),
+                                      m_outbuf[c],
+                                      ratio,
+                                      0.97f,
+                                      &m_time[c],
+                                      &additionalcount);
+
+            if (m_debugLevel > 2) {
+                std::cerr << "incount = " << incount << ", outcount = " << outcount << ", additionalcount = " << additionalcount << ", sum " << outcount + additionalcount << ", est space = " << lrintf(ceil(incount * ratio)) <<std::endl;
+            }
+        }
+
+        v_interleave(out + (outcount * m_channels),
+                     m_outbuf,
+                     m_channels,
+                     additionalcount);
+
+        outcount += additionalcount;
+    }
+
+    ippsThreshold_32f_I(out, outcount * m_channels, 1.f, ippCmpGreater);
+    ippsThreshold_32f_I(out, outcount * m_channels, -1.f, ippCmpLess);
+
+    return outcount;
+}
+
+void
+D_IPP::reset()
+{
+    //!!! not implemented: resampler state and buffers are left untouched
+}
+
+#endif /* HAVE_IPP */
 
 #ifdef HAVE_LIBSAMPLERATE
 
@@ -126,7 +501,9 @@ D_SRC::D_SRC(Resampler::Quality quality, int channels, int maxBufferSize,
     if (err) {
         std::cerr << "Resampler::Resampler: failed to create libsamplerate resampler: " 
                   << src_strerror(err) << std::endl;
+#ifndef NO_EXCEPTIONS
         throw Resampler::ImplementationError;
+#endif
     }
 
     if (maxBufferSize > 0 && m_channels > 1) {
@@ -184,7 +561,9 @@ D_SRC::resample(const float *const R__ *const R__ in,
     if (err) {
         std::cerr << "Resampler::process: libsamplerate error: "
                   << src_strerror(err) << std::endl;
+#ifndef NO_EXCEPTIONS
         throw Resampler::ImplementationError;
+#endif
     }
 
     if (m_channels > 1) {
@@ -220,7 +599,9 @@ D_SRC::resampleInterleaved(const float *const R__ in,
     if (err) {
         std::cerr << "Resampler::process: libsamplerate error: "
                   << src_strerror(err) << std::endl;
+#ifndef NO_EXCEPTIONS
         throw Resampler::ImplementationError;
+#endif
     }
 
     m_lastRatio = ratio;
@@ -424,6 +805,234 @@ D_Resample::reset()
 
 #endif /* HAVE_LIBRESAMPLE */
 
+#ifdef USE_SPEEX
+    
+class D_Speex : public ResamplerImpl
+{
+public:
+    D_Speex(Resampler::Quality quality, int channels, int maxBufferSize,
+            int debugLevel);
+    ~D_Speex();
+
+    int resample(const float *const R__ *const R__ in,
+                 float *const R__ *const R__ out,
+                 int incount,
+                 float ratio,
+                 bool final);
+
+    int resampleInterleaved(const float *const R__ in,
+                            float *const R__ out,
+                            int incount,
+                            float ratio,
+                            bool final = false);
+
+    int getChannelCount() const { return m_channels; }
+
+    void reset();
+
+protected:
+    SpeexResamplerState *m_resampler;
+    float *m_iin;
+    float *m_iout;
+    int m_channels;
+    int m_iinsize;
+    int m_ioutsize;
+    float m_lastratio;
+    bool m_initial;
+    int m_debugLevel;
+
+    void setRatio(float);
+};
+
+D_Speex::D_Speex(Resampler::Quality quality, int channels, int maxBufferSize,
+                 int debugLevel) :
+    m_resampler(0),
+    m_iin(0),
+    m_iout(0),
+    m_channels(channels),
+    m_iinsize(0),
+    m_ioutsize(0),
+    m_lastratio(1),
+    m_initial(true),
+    m_debugLevel(debugLevel)
+{
+    int q = (quality == Resampler::Best ? 10 :
+             quality == Resampler::Fastest ? 0 : 4);
+
+    if (m_debugLevel > 0) {
+        std::cerr << "Resampler::Resampler: using Speex implementation with q = "
+                  << q 
+                  << std::endl;
+    }
+
+    int err = 0;
+    m_resampler = speex_resampler_init_frac(m_channels,
+                                            1, 1,
+                                            48000, 48000, // irrelevant
+                                            q,
+                                            &err);
+    
+
+    if (err) {
+        std::cerr << "Resampler::Resampler: failed to create Speex resampler" 
+                  << std::endl;
+#ifndef NO_EXCEPTIONS
+        throw Resampler::ImplementationError;
+#endif
+    }
+
+    if (maxBufferSize > 0 && m_channels > 1) {
+        m_iinsize = maxBufferSize * m_channels;
+        m_ioutsize = maxBufferSize * m_channels * 2;
+        m_iin = allocate<float>(m_iinsize);
+        m_iout = allocate<float>(m_ioutsize);
+    }
+}
+
+D_Speex::~D_Speex()
+{
+    speex_resampler_destroy(m_resampler);
+    deallocate<float>(m_iin);
+    deallocate<float>(m_iout);
+}
+
+void
+D_Speex::setRatio(float ratio)
+{
+    // Speex wants a ratio of two unsigned integers, not a single
+    // float.  Let's do that.
+
+    unsigned int big = 272408136U; 
+    unsigned int denom = 1, num = 1;
+
+    if (ratio < 1.f) {
+        denom = big;
+        double dnum = double(big) * double(ratio);
+        num = (unsigned int)dnum;
+    } else if (ratio > 1.f) {
+        num = big;
+        double ddenom = double(big) / double(ratio);
+        denom = (unsigned int)ddenom;
+    }
+    
+    if (m_debugLevel > 1) {
+        std::cerr << "D_Speex: Desired ratio " << ratio << ", requesting ratio "
+                  << num << "/" << denom << " = " << float(double(num)/double(denom))
+                  << std::endl;
+    }
+    
+    int err = speex_resampler_set_rate_frac
+        (m_resampler, denom, num, 48000, 48000);
+    if (err) std::cerr << "D_Speex: failed to set ratio: " << speex_resampler_strerror(err) << std::endl;
+    
+    speex_resampler_get_ratio(m_resampler, &denom, &num);
+    
+    if (m_debugLevel > 1) {
+        std::cerr << "D_Speex: Desired ratio " << ratio << ", got ratio "
+                  << num << "/" << denom << " = " << float(double(num)/double(denom))
+                  << std::endl;
+    }
+    
+    m_lastratio = ratio;
+
+    if (m_initial) {
+        speex_resampler_skip_zeros(m_resampler);
+        m_initial = false;
+    }
+}
+
+int
+D_Speex::resample(const float *const R__ *const R__ in,
+                  float *const R__ *const R__ out,
+                  int incount,
+                  float ratio,
+                  bool final)
+{
+    if (ratio != m_lastratio) {
+        setRatio(ratio);
+    }
+
+    unsigned int uincount = incount;
+    unsigned int outcount = lrintf(ceilf(incount * ratio)); //!!! inexact now
+
+    float *data_in, *data_out;
+
+    if (m_channels == 1) {
+        data_in = const_cast<float *>(*in);
+        data_out = *out;
+    } else {
+        if (incount * m_channels > m_iinsize) {
+            m_iin = reallocate<float>(m_iin, m_iinsize, incount * m_channels);
+            m_iinsize = incount * m_channels;
+        }
+        if (outcount * m_channels > m_ioutsize) {
+            m_iout = reallocate<float>(m_iout, m_ioutsize, outcount * m_channels);
+            m_ioutsize = outcount * m_channels;
+        }
+        v_interleave(m_iin, in, m_channels, incount);
+        data_in = m_iin;
+        data_out = m_iout;
+    }
+
+    int err = speex_resampler_process_interleaved_float(m_resampler,
+                                                        data_in,
+                                                        &uincount,
+                                                        data_out,
+                                                        &outcount);
+
+//    if (incount != int(uincount)) {
+//        std::cerr << "Resampler: NOTE: Consumed " << uincount
+//                  << " of " << incount << " frames" << std::endl;
+//    }
+
+//    if (outcount != lrintf(ceilf(incount * ratio))) {
+//        std::cerr << "Resampler: NOTE: Obtained " << outcount
+//                  << " of " << lrintf(ceilf(incount * ratio)) << " frames"
+//                  << std::endl;
+//    }
+        
+    //!!! check err, respond appropriately
+
+    if (m_channels > 1) {
+        v_deinterleave(out, m_iout, m_channels, outcount);
+    }
+
+    return outcount;
+}
+
+int
+D_Speex::resampleInterleaved(const float *const R__ in,
+                             float *const R__ out,
+                             int incount,
+                             float ratio,
+                             bool final)
+{
+    if (ratio != m_lastratio) {
+        setRatio(ratio);
+    }
+
+    unsigned int uincount = incount;
+    unsigned int outcount = lrintf(ceilf(incount * ratio)); //!!! inexact now
+
+    float *data_in = const_cast<float *>(in);
+    float *data_out = out;
+
+    int err = speex_resampler_process_interleaved_float(m_resampler,
+                                                        data_in,
+                                                        &uincount,
+                                                        data_out,
+                                                        &outcount);
+
+    return outcount;
+}
+
+void
+D_Speex::reset()
+{
+    speex_resampler_reset_mem(m_resampler);
+}
+
+#endif
 
 } /* end namespace Resamplers */
 
@@ -435,6 +1044,12 @@ Resampler::Resampler(Resampler::Quality quality, int channels,
     switch (quality) {
 
     case Resampler::Best:
+#ifdef HAVE_IPP
+        m_method = 0;
+#endif
+#ifdef USE_SPEEX
+        m_method = 2;
+#endif
 #ifdef HAVE_LIBRESAMPLE
         m_method = 3;
 #endif
@@ -444,18 +1059,30 @@ Resampler::Resampler(Resampler::Quality quality, int channels,
         break;
 
     case Resampler::FastestTolerable:
+#ifdef HAVE_IPP
+        m_method = 0;
+#endif
 #ifdef HAVE_LIBRESAMPLE
         m_method = 3;
 #endif
 #ifdef HAVE_LIBSAMPLERATE
         m_method = 1;
+#endif
+#ifdef USE_SPEEX
+        m_method = 2;
 #endif
         break;
 
     case Resampler::Fastest:
+#ifdef HAVE_IPP
+        m_method = 0;
+#endif
 #ifdef HAVE_LIBRESAMPLE
         m_method = 3;
 #endif
+#ifdef USE_SPEEX
+        m_method = 2;
+#endif
 #ifdef HAVE_LIBSAMPLERATE
         m_method = 1;
 #endif
@@ -471,10 +1098,14 @@ Resampler::Resampler(Resampler::Quality quality, int channels,
 
     switch (m_method) {
     case 0:
+#ifdef HAVE_IPP
+        d = new Resamplers::D_IPP(quality, channels, maxBufferSize, debugLevel);
+#else
         std::cerr << "Resampler::Resampler(" << quality << ", " << channels
                   << ", " << maxBufferSize << "): No implementation available!"
                   << std::endl;
         abort();
+#endif
         break;
 
     case 1:
@@ -489,10 +1120,14 @@ Resampler::Resampler(Resampler::Quality quality, int channels,
         break;
 
     case 2:
+#ifdef USE_SPEEX
+        d = new Resamplers::D_Speex(quality, channels, maxBufferSize, debugLevel);
+#else
         std::cerr << "Resampler::Resampler(" << quality << ", " << channels
                   << ", " << maxBufferSize << "): No implementation available!"
                   << std::endl;
         abort();
+#endif
         break;
 
     case 3:
diff --git a/src/dsp/Resampler.h b/src/dsp/Resampler.h
index b54fb8d..f91157c 100644
--- a/src/dsp/Resampler.h
+++ b/src/dsp/Resampler.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*- vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_RESAMPLER_H_
diff --git a/src/dsp/SampleFilter.h b/src/dsp/SampleFilter.h
index 43d53f6..c2589f4 100644
--- a/src/dsp/SampleFilter.h
+++ b/src/dsp/SampleFilter.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _SAMPLE_FILTER_H_
diff --git a/src/dsp/SincWindow.h b/src/dsp/SincWindow.h
index e1cddee..3d917c8 100644
--- a/src/dsp/SincWindow.h
+++ b/src/dsp/SincWindow.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_SINC_WINDOW_H_
diff --git a/src/dsp/Window.cpp b/src/dsp/Window.cpp
deleted file mode 100644
index 106faa7..0000000
--- a/src/dsp/Window.cpp
+++ /dev/null
@@ -1,17 +0,0 @@
-/* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
-
-/*
-    Rubber Band
-    An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2008 Chris Cannam.
-    
-    This program is free software; you can redistribute it and/or
-    modify it under the terms of the GNU General Public License as
-    published by the Free Software Foundation; either version 2 of the
-    License, or (at your option) any later version.  See the file
-    COPYING included with this distribution for more information.
-*/
-
-#include "Window.h"
-
-
diff --git a/src/dsp/Window.h b/src/dsp/Window.h
index fa06fd3..6ffa184 100644
--- a/src/dsp/Window.h
+++ b/src/dsp/Window.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_WINDOW_H_
diff --git a/src/getopt/getopt.h b/src/getopt/getopt.h
index 2cd3191..d95d6cf 100644
--- a/src/getopt/getopt.h
+++ b/src/getopt/getopt.h
@@ -107,4 +107,4 @@ GETOPT_API int getopt __P((int, char * const *, const char *));
 __END_DECLS
 #endif
  
-#endif 
+#endif /* !_GETOPT_H_ */
diff --git a/src/jni/RubberBandStretcherJNI.cpp b/src/jni/RubberBandStretcherJNI.cpp
new file mode 100644
index 0000000..cf6d887
--- /dev/null
+++ b/src/jni/RubberBandStretcherJNI.cpp
@@ -0,0 +1,370 @@
+/* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
+/* Copyright Chris Cannam - All Rights Reserved */
+
+#include "rubberband/RubberBandStretcher.h"
+
+#include "system/Allocators.h"
+
+#include <jni.h>
+
+using namespace RubberBand;
+
+extern "C" {
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    dispose
+ * Signature: ()V
+ */
+JNIEXPORT void JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_dispose
+  (JNIEnv *, jobject);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    reset
+ * Signature: ()V
+ */
+JNIEXPORT void JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_reset
+  (JNIEnv *, jobject);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    setTimeRatio
+ * Signature: (D)V
+ */
+JNIEXPORT void JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_setTimeRatio
+  (JNIEnv *, jobject, jdouble);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    setPitchScale
+ * Signature: (D)V
+ */
+JNIEXPORT void JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_setPitchScale
+  (JNIEnv *, jobject, jdouble);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    getChannelCount
+ * Signature: ()I
+ */
+JNIEXPORT jint JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_getChannelCount
+  (JNIEnv *, jobject);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    getTimeRatio
+ * Signature: ()D
+ */
+JNIEXPORT jdouble JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_getTimeRatio
+  (JNIEnv *, jobject);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    getPitchScale
+ * Signature: ()D
+ */
+JNIEXPORT jdouble JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_getPitchScale
+  (JNIEnv *, jobject);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    getLatency
+ * Signature: ()I
+ */
+JNIEXPORT jint JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_getLatency
+  (JNIEnv *, jobject);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    setTransientsOption
+ * Signature: (I)V
+ */
+JNIEXPORT void JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_setTransientsOption
+  (JNIEnv *, jobject, jint);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    setDetectorOption
+ * Signature: (I)V
+ */
+JNIEXPORT void JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_setDetectorOption
+  (JNIEnv *, jobject, jint);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    setPhaseOption
+ * Signature: (I)V
+ */
+JNIEXPORT void JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_setPhaseOption
+  (JNIEnv *, jobject, jint);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    setFormantOption
+ * Signature: (I)V
+ */
+JNIEXPORT void JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_setFormantOption
+  (JNIEnv *, jobject, jint);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    setPitchOption
+ * Signature: (I)V
+ */
+JNIEXPORT void JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_setPitchOption
+  (JNIEnv *, jobject, jint);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    setExpectedInputDuration
+ * Signature: (J)V
+ */
+JNIEXPORT void JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_setExpectedInputDuration
+  (JNIEnv *, jobject, jlong);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    setMaxProcessSize
+ * Signature: (I)V
+ */
+JNIEXPORT void JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_setMaxProcessSize
+  (JNIEnv *, jobject, jint);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    getSamplesRequired
+ * Signature: ()I
+ */
+JNIEXPORT jint JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_getSamplesRequired
+  (JNIEnv *, jobject);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    study
+ * Signature: ([[FZ)V
+ */
+JNIEXPORT void JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_study
+  (JNIEnv *, jobject, jobjectArray, jboolean);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    process
+ * Signature: ([[FZ)V
+ */
+JNIEXPORT void JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_process
+  (JNIEnv *, jobject, jobjectArray, jboolean);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    available
+ * Signature: ()I
+ */
+JNIEXPORT jint JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_available
+  (JNIEnv *, jobject);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    retrieve
+ * Signature: ([[F)I
+ */
+JNIEXPORT jint JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_retrieve
+  (JNIEnv *, jobject, jobjectArray);
+
+/*
+ * Class:     com_breakfastquay_rubberband_RubberBandStretcher
+ * Method:    initialise
+ * Signature: (IIIDD)V
+ */
+JNIEXPORT void JNICALL Java_com_breakfastquay_rubberband_RubberBandStretcher_initialise
+  (JNIEnv *, jobject, jint, jint, jint, jdouble, jdouble);
+
+}
+
+RubberBandStretcher *
+getStretcher(JNIEnv *env, jobject obj)
+{
+    jclass c = env->GetObjectClass(obj);
+    jfieldID fid = env->GetFieldID(c, "handle", "J");
+    jlong handle = env->GetLongField(obj, fid);
+    return (RubberBandStretcher *)handle;
+}
+
+void
+setStretcher(JNIEnv *env, jobject obj, RubberBandStretcher *stretcher)
+{
+    jclass c = env->GetObjectClass(obj);
+    jfieldID fid = env->GetFieldID(c, "handle", "J");
+    jlong handle = (jlong)stretcher;
+    env->SetLongField(obj, fid, handle);
+}
+
+void
+Java_com_breakfastquay_rubberband_RubberBandStretcher_initialise(JNIEnv *env, jobject obj, jint sampleRate, jint channels, jint options, jdouble initialTimeRatio, jdouble initialPitchScale)
+{
+    setStretcher(env, obj, new RubberBandStretcher
+                 (sampleRate, channels, options, initialTimeRatio, initialPitchScale));
+}
+
+void
+Java_com_breakfastquay_rubberband_RubberBandStretcher_dispose(JNIEnv *env, jobject obj)
+{
+    delete getStretcher(env, obj);
+    setStretcher(env, obj, 0);
+}
+
+void
+Java_com_breakfastquay_rubberband_RubberBandStretcher_reset(JNIEnv *env, jobject obj)
+{
+    getStretcher(env, obj)->reset();
+}
+
+void
+Java_com_breakfastquay_rubberband_RubberBandStretcher_setTimeRatio(JNIEnv *env, jobject obj, jdouble ratio)
+{
+    getStretcher(env, obj)->setTimeRatio(ratio);
+}
+
+void
+Java_com_breakfastquay_rubberband_RubberBandStretcher_setPitchScale(JNIEnv *env, jobject obj, jdouble scale)
+{
+    getStretcher(env, obj)->setPitchScale(scale);
+}
+
+jint
+Java_com_breakfastquay_rubberband_RubberBandStretcher_getChannelCount(JNIEnv *env, jobject obj)
+{
+    return getStretcher(env, obj)->getChannelCount();
+}
+
+jdouble
+Java_com_breakfastquay_rubberband_RubberBandStretcher_getTimeRatio(JNIEnv *env, jobject obj)
+{
+    return getStretcher(env, obj)->getTimeRatio();
+}
+
+jdouble
+Java_com_breakfastquay_rubberband_RubberBandStretcher_getPitchScale(JNIEnv *env, jobject obj)
+{
+    return getStretcher(env, obj)->getPitchScale();
+}
+
+jint
+Java_com_breakfastquay_rubberband_RubberBandStretcher_getLatency(JNIEnv *env, jobject obj)
+{
+    return getStretcher(env, obj)->getLatency();
+}
+
+void
+Java_com_breakfastquay_rubberband_RubberBandStretcher_setTransientsOption(JNIEnv *env, jobject obj, jint options)
+{
+    getStretcher(env, obj)->setTransientsOption(options);
+}
+
+void
+Java_com_breakfastquay_rubberband_RubberBandStretcher_setDetectorOption(JNIEnv *env, jobject obj, jint options)
+{
+    getStretcher(env, obj)->setDetectorOption(options);
+}
+
+void
+Java_com_breakfastquay_rubberband_RubberBandStretcher_setPhaseOption(JNIEnv *env, jobject obj, jint options)
+{
+    getStretcher(env, obj)->setPhaseOption(options);
+}
+
+void
+Java_com_breakfastquay_rubberband_RubberBandStretcher_setFormantOption(JNIEnv *env, jobject obj, jint options)
+{
+    getStretcher(env, obj)->setFormantOption(options);
+}
+
+void
+Java_com_breakfastquay_rubberband_RubberBandStretcher_setPitchOption(JNIEnv *env, jobject obj, jint options)
+{
+    getStretcher(env, obj)->setPitchOption(options);
+}
+
+void
+Java_com_breakfastquay_rubberband_RubberBandStretcher_setExpectedInputDuration(JNIEnv *env, jobject obj, jlong duration)
+{
+    getStretcher(env, obj)->setExpectedInputDuration(duration);
+}
+
+jint
+Java_com_breakfastquay_rubberband_RubberBandStretcher_getSamplesRequired(JNIEnv *env, jobject obj)
+{
+    return getStretcher(env, obj)->getSamplesRequired();
+}
+
+void
+Java_com_breakfastquay_rubberband_RubberBandStretcher_study(JNIEnv *env, jobject obj, jobjectArray data, jboolean final)
+{
+    int channels = env->GetArrayLength(data);
+    float **input = new float *[channels];
+    int samples = 0;
+    for (int c = 0; c < channels; ++c) {
+        jfloatArray cdata = (jfloatArray)env->GetObjectArrayElement(data, c);
+        samples = env->GetArrayLength(cdata);
+        input[c] = env->GetFloatArrayElements(cdata, 0);
+    }
+
+    getStretcher(env, obj)->study(input, samples, final);
+    for (int c = 0; c < channels; ++c) {
+        jfloatArray cdata = (jfloatArray)env->GetObjectArrayElement(data, c);
+        env->ReleaseFloatArrayElements(cdata, input[c], 0);
+    }
+    delete[] input;
+}
+
+void
+Java_com_breakfastquay_rubberband_RubberBandStretcher_process(JNIEnv *env, jobject obj, jobjectArray data, jboolean final)
+{
+    int channels = env->GetArrayLength(data);
+    float **input = allocate<float *>(channels);
+    int samples = 0;
+    for (int c = 0; c < channels; ++c) {
+        jfloatArray cdata = (jfloatArray)env->GetObjectArrayElement(data, c);
+        samples = env->GetArrayLength(cdata);
+        input[c] = env->GetFloatArrayElements(cdata, 0);
+    }
+
+    getStretcher(env, obj)->process(input, samples, final);
+
+    for (int c = 0; c < channels; ++c) {
+        jfloatArray cdata = (jfloatArray)env->GetObjectArrayElement(data, c);
+        env->ReleaseFloatArrayElements(cdata, input[c], 0);
+    }
+
+    deallocate(input);
+}
+
+jint
+Java_com_breakfastquay_rubberband_RubberBandStretcher_available(JNIEnv *env, jobject obj)
+{
+    return getStretcher(env, obj)->available();
+}
+
+jint
+Java_com_breakfastquay_rubberband_RubberBandStretcher_retrieve(JNIEnv *env, jobject obj, jobjectArray output)
+{
+    RubberBandStretcher *stretcher = getStretcher(env, obj);
+    size_t channels = stretcher->getChannelCount();
+    
+    jfloatArray first = (jfloatArray)env->GetObjectArrayElement(output, 0);
+    int space = env->GetArrayLength(first);
+    env->DeleteLocalRef(first);
+
+    float **outbuf = allocate_channels<float>(channels, space);
+    size_t retrieved = stretcher->retrieve(outbuf, space);
+
+    for (size_t c = 0; c < channels; ++c) {
+        jfloatArray cdata = (jfloatArray)env->GetObjectArrayElement(output, c);
+        env->SetFloatArrayRegion(cdata, 0, retrieved, outbuf[c]);
+        env->DeleteLocalRef(cdata);
+    }
+    
+    deallocate_channels(outbuf, channels);
+    return retrieved;
+}
+
diff --git a/src/kissfft/COPYING b/src/kissfft/COPYING
new file mode 100644
index 0000000..b22325a
--- /dev/null
+++ b/src/kissfft/COPYING
@@ -0,0 +1,11 @@
+Copyright (c) 2003-2004 Mark Borgerding
+
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
+    * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
diff --git a/src/kissfft/_kiss_fft_guts.h b/src/kissfft/_kiss_fft_guts.h
new file mode 100644
index 0000000..1c1d4d7
--- /dev/null
+++ b/src/kissfft/_kiss_fft_guts.h
@@ -0,0 +1,150 @@
+/*
+Copyright (c) 2003-2004, Mark Borgerding
+
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
+    * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+/* kiss_fft.h
+   defines kiss_fft_scalar as either short or a float type
+   and defines
+   typedef struct { kiss_fft_scalar r; kiss_fft_scalar i; }kiss_fft_cpx; */
+#include "kiss_fft.h"
+#include <limits.h>
+
+#define MAXFACTORS 32
+/* e.g. an fft of length 128 has 4 factors 
+ as far as kissfft is concerned
+ 4*4*4*2
+ */
+
+struct kiss_fft_state{
+    int nfft;
+    int inverse;
+    int factors[2*MAXFACTORS];
+    kiss_fft_cpx twiddles[1];
+};
+
+/*
+  Explanation of macros dealing with complex math:
+
+   C_MUL(m,a,b)         : m = a*b
+   C_FIXDIV( c , div )  : if a fixed point impl., c /= div. noop otherwise
+   C_SUB( res, a,b)     : res = a - b
+   C_SUBFROM( res , a)  : res -= a
+   C_ADDTO( res , a)    : res += a
+ * */
+#ifdef FIXED_POINT
+#if (FIXED_POINT==32)
+# define FRACBITS 31
+# define SAMPPROD int64_t
+#define SAMP_MAX 2147483647
+#else
+# define FRACBITS 15
+# define SAMPPROD int32_t 
+#define SAMP_MAX 32767
+#endif
+
+#define SAMP_MIN -SAMP_MAX
+
+#if defined(CHECK_OVERFLOW)
+#  define CHECK_OVERFLOW_OP(a,op,b)  \
+	if ( (SAMPPROD)(a) op (SAMPPROD)(b) > SAMP_MAX || (SAMPPROD)(a) op (SAMPPROD)(b) < SAMP_MIN ) { \
+		fprintf(stderr,"WARNING:overflow @ " __FILE__ "(%d): (%d " #op" %d) = %ld\n",__LINE__,(a),(b),(SAMPPROD)(a) op (SAMPPROD)(b) );  }
+#endif
+
+
+#   define smul(a,b) ( (SAMPPROD)(a)*(b) )
+#   define sround( x )  (kiss_fft_scalar)( ( (x) + (1<<(FRACBITS-1)) ) >> FRACBITS )
+
+#   define S_MUL(a,b) sround( smul(a,b) )
+
+#   define C_MUL(m,a,b) \
+      do{ (m).r = sround( smul((a).r,(b).r) - smul((a).i,(b).i) ); \
+          (m).i = sround( smul((a).r,(b).i) + smul((a).i,(b).r) ); }while(0)
+
+#   define DIVSCALAR(x,k) \
+	(x) = sround( smul(  x, SAMP_MAX/k ) )
+
+#   define C_FIXDIV(c,div) \
+	do {    DIVSCALAR( (c).r , div);  \
+		DIVSCALAR( (c).i  , div); }while (0)
+
+#   define C_MULBYSCALAR( c, s ) \
+    do{ (c).r =  sround( smul( (c).r , s ) ) ;\
+        (c).i =  sround( smul( (c).i , s ) ) ; }while(0)
+
+#else  /* not FIXED_POINT*/
+
+#   define S_MUL(a,b) ( (a)*(b) )
+#define C_MUL(m,a,b) \
+    do{ (m).r = (a).r*(b).r - (a).i*(b).i;\
+        (m).i = (a).r*(b).i + (a).i*(b).r; }while(0)
+#   define C_FIXDIV(c,div) /* NOOP */
+#   define C_MULBYSCALAR( c, s ) \
+    do{ (c).r *= (s);\
+        (c).i *= (s); }while(0)
+#endif
+
+#ifndef CHECK_OVERFLOW_OP
+#  define CHECK_OVERFLOW_OP(a,op,b) /* noop */
+#endif
+
+#define  C_ADD( res, a,b)\
+    do { \
+	    CHECK_OVERFLOW_OP((a).r,+,(b).r)\
+	    CHECK_OVERFLOW_OP((a).i,+,(b).i)\
+	    (res).r=(a).r+(b).r;  (res).i=(a).i+(b).i; \
+    }while(0)
+#define  C_SUB( res, a,b)\
+    do { \
+	    CHECK_OVERFLOW_OP((a).r,-,(b).r)\
+	    CHECK_OVERFLOW_OP((a).i,-,(b).i)\
+	    (res).r=(a).r-(b).r;  (res).i=(a).i-(b).i; \
+    }while(0)
+#define C_ADDTO( res , a)\
+    do { \
+	    CHECK_OVERFLOW_OP((res).r,+,(a).r)\
+	    CHECK_OVERFLOW_OP((res).i,+,(a).i)\
+	    (res).r += (a).r;  (res).i += (a).i;\
+    }while(0)
+
+#define C_SUBFROM( res , a)\
+    do {\
+	    CHECK_OVERFLOW_OP((res).r,-,(a).r)\
+	    CHECK_OVERFLOW_OP((res).i,-,(a).i)\
+	    (res).r -= (a).r;  (res).i -= (a).i; \
+    }while(0)
+
+
+#ifdef FIXED_POINT
+#  define KISS_FFT_COS(phase)  floor(.5+SAMP_MAX * cos (phase))
+#  define KISS_FFT_SIN(phase)  floor(.5+SAMP_MAX * sin (phase))
+#  define HALF_OF(x) ((x)>>1)
+#elif defined(USE_SIMD)
+#  define KISS_FFT_COS(phase) _mm_set1_ps( cos(phase) )
+#  define KISS_FFT_SIN(phase) _mm_set1_ps( sin(phase) )
+#  define HALF_OF(x) ((x)*_mm_set1_ps(.5))
+#else
+#  define KISS_FFT_COS(phase) (kiss_fft_scalar) cos(phase)
+#  define KISS_FFT_SIN(phase) (kiss_fft_scalar) sin(phase)
+#  define HALF_OF(x) ((x)*.5)
+#endif
+
+#define  kf_cexp(x,phase) \
+	do{ \
+		(x)->r = KISS_FFT_COS(phase);\
+		(x)->i = KISS_FFT_SIN(phase);\
+	}while(0)
+
+
+/* a debugging function */
+#define pcpx(c)\
+    fprintf(stderr,"%g + %gi\n",(double)((c)->r),(double)((c)->i) )
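Note on the fixed-point macros in _kiss_fft_guts.h above: when FIXED_POINT is defined and is not 32, S_MUL widens both operands to a 32-bit product (smul), adds half of one output step, and shifts right by FRACBITS (sround). The block below is a minimal standalone sketch of that arithmetic, not code taken from kissfft. The names q15_round, q15_mul, q15_cmul and q15_cpx are invented here for illustration, and the sketch assumes that right-shifting a negative value is an arithmetic shift, which the original macros also rely on.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative type only; kissfft itself uses kiss_fft_cpx. */
typedef struct { int16_t r; int16_t i; } q15_cpx;

/* Round a Q30 product back to Q15: add half an LSB, then drop 15 bits.
   Mirrors sround() with FRACBITS == 15. */
int16_t q15_round(int32_t wide)
{
    return (int16_t)((wide + ((int32_t)1 << 14)) >> 15);
}

/* Q15 multiply, mirroring S_MUL. INT16_MIN is excluded, matching the
   SAMP_MIN == -SAMP_MAX convention in the header. */
int16_t q15_mul(int16_t a, int16_t b)
{
    assert(a > INT16_MIN && b > INT16_MIN);
    return q15_round((int32_t)a * b);   /* widen before multiplying */
}

/* Complex Q15 multiply, mirroring C_MUL: each component is formed at
   32-bit width and rounded once. */
q15_cpx q15_cmul(q15_cpx a, q15_cpx b)
{
    q15_cpx m;
    assert(a.r > INT16_MIN && a.i > INT16_MIN);
    assert(b.r > INT16_MIN && b.i > INT16_MIN);
    m.r = q15_round((int32_t)a.r * b.r - (int32_t)a.i * b.i);
    m.i = q15_round((int32_t)a.r * b.i + (int32_t)a.i * b.r);
    return m;
}
```

For instance, q15_mul(16384, 16384) gives 8192, which is 0.5 * 0.5 = 0.25 in Q15. Rounding once per component, rather than once per partial product, keeps the error of a complex multiply to half an LSB per component.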
diff --git a/src/kissfft/kiss_fft.c b/src/kissfft/kiss_fft.c
new file mode 100644
index 0000000..79c9392
--- /dev/null
+++ b/src/kissfft/kiss_fft.c
@@ -0,0 +1,399 @@
+/*
+Copyright (c) 2003-2004, Mark Borgerding
+
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
+    * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+
+#include "_kiss_fft_guts.h"
+/* The guts header contains all the multiplication and addition macros that are defined for
+ fixed or floating point complex numbers.  It also declares the kf_ internal functions.
+ */
+
+static kiss_fft_cpx *scratchbuf=NULL;
+static size_t nscratchbuf=0;
+static kiss_fft_cpx *tmpbuf=NULL;
+static size_t ntmpbuf=0;
+
+#define CHECKBUF(buf,nbuf,n) \
+    do { \
+        if ( nbuf < (size_t)(n) ) {\
+            free(buf); \
+            buf = (kiss_fft_cpx*)KISS_FFT_MALLOC(sizeof(kiss_fft_cpx)*(n)); \
+            nbuf = (size_t)(n); \
+        } \
+   }while(0)
+
+
+static void kf_bfly2(
+        kiss_fft_cpx * Fout,
+        const size_t fstride,
+        const kiss_fft_cfg st,
+        int m
+        )
+{
+    kiss_fft_cpx * Fout2;
+    kiss_fft_cpx * tw1 = st->twiddles;
+    kiss_fft_cpx t;
+    Fout2 = Fout + m;
+    do{
+        C_FIXDIV(*Fout,2); C_FIXDIV(*Fout2,2);
+
+        C_MUL (t,  *Fout2 , *tw1);
+        tw1 += fstride;
+        C_SUB( *Fout2 ,  *Fout , t );
+        C_ADDTO( *Fout ,  t );
+        ++Fout2;
+        ++Fout;
+    }while (--m);
+}
+
+static void kf_bfly4(
+        kiss_fft_cpx * Fout,
+        const size_t fstride,
+        const kiss_fft_cfg st,
+        const size_t m
+        )
+{
+    kiss_fft_cpx *tw1,*tw2,*tw3;
+    kiss_fft_cpx scratch[6];
+    size_t k=m;
+    const size_t m2=2*m;
+    const size_t m3=3*m;
+
+    tw3 = tw2 = tw1 = st->twiddles;
+
+    do {
+        C_FIXDIV(*Fout,4); C_FIXDIV(Fout[m],4); C_FIXDIV(Fout[m2],4); C_FIXDIV(Fout[m3],4);
+
+        C_MUL(scratch[0],Fout[m] , *tw1 );
+        C_MUL(scratch[1],Fout[m2] , *tw2 );
+        C_MUL(scratch[2],Fout[m3] , *tw3 );
+
+        C_SUB( scratch[5] , *Fout, scratch[1] );
+        C_ADDTO(*Fout, scratch[1]);
+        C_ADD( scratch[3] , scratch[0] , scratch[2] );
+        C_SUB( scratch[4] , scratch[0] , scratch[2] );
+        C_SUB( Fout[m2], *Fout, scratch[3] );
+        tw1 += fstride;
+        tw2 += fstride*2;
+        tw3 += fstride*3;
+        C_ADDTO( *Fout , scratch[3] );
+
+        if(st->inverse) {
+            Fout[m].r = scratch[5].r - scratch[4].i;
+            Fout[m].i = scratch[5].i + scratch[4].r;
+            Fout[m3].r = scratch[5].r + scratch[4].i;
+            Fout[m3].i = scratch[5].i - scratch[4].r;
+        }else{
+            Fout[m].r = scratch[5].r + scratch[4].i;
+            Fout[m].i = scratch[5].i - scratch[4].r;
+            Fout[m3].r = scratch[5].r - scratch[4].i;
+            Fout[m3].i = scratch[5].i + scratch[4].r;
+        }
+        ++Fout;
+    }while(--k);
+}
+
+static void kf_bfly3(
+         kiss_fft_cpx * Fout,
+         const size_t fstride,
+         const kiss_fft_cfg st,
+         size_t m
+         )
+{
+     size_t k=m;
+     const size_t m2 = 2*m;
+     kiss_fft_cpx *tw1,*tw2;
+     kiss_fft_cpx scratch[5];
+     kiss_fft_cpx epi3;
+     epi3 = st->twiddles[fstride*m];
+
+     tw1=tw2=st->twiddles;
+
+     do{
+         C_FIXDIV(*Fout,3); C_FIXDIV(Fout[m],3); C_FIXDIV(Fout[m2],3);
+
+         C_MUL(scratch[1],Fout[m] , *tw1);
+         C_MUL(scratch[2],Fout[m2] , *tw2);
+
+         C_ADD(scratch[3],scratch[1],scratch[2]);
+         C_SUB(scratch[0],scratch[1],scratch[2]);
+         tw1 += fstride;
+         tw2 += fstride*2;
+
+         Fout[m].r = Fout->r - HALF_OF(scratch[3].r);
+         Fout[m].i = Fout->i - HALF_OF(scratch[3].i);
+
+         C_MULBYSCALAR( scratch[0] , epi3.i );
+
+         C_ADDTO(*Fout,scratch[3]);
+
+         Fout[m2].r = Fout[m].r + scratch[0].i;
+         Fout[m2].i = Fout[m].i - scratch[0].r;
+
+         Fout[m].r -= scratch[0].i;
+         Fout[m].i += scratch[0].r;
+
+         ++Fout;
+     }while(--k);
+}
+
+static void kf_bfly5(
+        kiss_fft_cpx * Fout,
+        const size_t fstride,
+        const kiss_fft_cfg st,
+        int m
+        )
+{
+    kiss_fft_cpx *Fout0,*Fout1,*Fout2,*Fout3,*Fout4;
+    int u;
+    kiss_fft_cpx scratch[13];
+    kiss_fft_cpx * twiddles = st->twiddles;
+    kiss_fft_cpx *tw;
+    kiss_fft_cpx ya,yb;
+    ya = twiddles[fstride*m];
+    yb = twiddles[fstride*2*m];
+
+    Fout0=Fout;
+    Fout1=Fout0+m;
+    Fout2=Fout0+2*m;
+    Fout3=Fout0+3*m;
+    Fout4=Fout0+4*m;
+
+    tw=st->twiddles;
+    for ( u=0; u<m; ++u ) {
+        C_FIXDIV( *Fout0,5); C_FIXDIV( *Fout1,5); C_FIXDIV( *Fout2,5); C_FIXDIV( *Fout3,5); C_FIXDIV( *Fout4,5);
+        scratch[0] = *Fout0;
+
+        C_MUL(scratch[1] ,*Fout1, tw[u*fstride]);
+        C_MUL(scratch[2] ,*Fout2, tw[2*u*fstride]);
+        C_MUL(scratch[3] ,*Fout3, tw[3*u*fstride]);
+        C_MUL(scratch[4] ,*Fout4, tw[4*u*fstride]);
+
+        C_ADD( scratch[7],scratch[1],scratch[4]);
+        C_SUB( scratch[10],scratch[1],scratch[4]);
+        C_ADD( scratch[8],scratch[2],scratch[3]);
+        C_SUB( scratch[9],scratch[2],scratch[3]);
+
+        Fout0->r += scratch[7].r + scratch[8].r;
+        Fout0->i += scratch[7].i + scratch[8].i;
+
+        scratch[5].r = scratch[0].r + S_MUL(scratch[7].r,ya.r) + S_MUL(scratch[8].r,yb.r);
+        scratch[5].i = scratch[0].i + S_MUL(scratch[7].i,ya.r) + S_MUL(scratch[8].i,yb.r);
+
+        scratch[6].r =  S_MUL(scratch[10].i,ya.i) + S_MUL(scratch[9].i,yb.i);
+        scratch[6].i = -S_MUL(scratch[10].r,ya.i) - S_MUL(scratch[9].r,yb.i);
+
+        C_SUB(*Fout1,scratch[5],scratch[6]);
+        C_ADD(*Fout4,scratch[5],scratch[6]);
+
+        scratch[11].r = scratch[0].r + S_MUL(scratch[7].r,yb.r) + S_MUL(scratch[8].r,ya.r);
+        scratch[11].i = scratch[0].i + S_MUL(scratch[7].i,yb.r) + S_MUL(scratch[8].i,ya.r);
+        scratch[12].r = - S_MUL(scratch[10].i,yb.i) + S_MUL(scratch[9].i,ya.i);
+        scratch[12].i = S_MUL(scratch[10].r,yb.i) - S_MUL(scratch[9].r,ya.i);
+
+        C_ADD(*Fout2,scratch[11],scratch[12]);
+        C_SUB(*Fout3,scratch[11],scratch[12]);
+
+        ++Fout0;++Fout1;++Fout2;++Fout3;++Fout4;
+    }
+}
+
+/* perform the butterfly for one stage of a mixed radix FFT */
+static void kf_bfly_generic(
+        kiss_fft_cpx * Fout,
+        const size_t fstride,
+        const kiss_fft_cfg st,
+        int m,
+        int p
+        )
+{
+    int u,k,q1,q;
+    kiss_fft_cpx * twiddles = st->twiddles;
+    kiss_fft_cpx t;
+    int Norig = st->nfft;
+
+    CHECKBUF(scratchbuf,nscratchbuf,p);
+
+    for ( u=0; u<m; ++u ) {
+        k=u;
+        for ( q1=0 ; q1<p ; ++q1 ) {
+            scratchbuf[q1] = Fout[ k  ];
+            C_FIXDIV(scratchbuf[q1],p);
+            k += m;
+        }
+
+        k=u;
+        for ( q1=0 ; q1<p ; ++q1 ) {
+            int twidx=0;
+            Fout[ k ] = scratchbuf[0];
+            for (q=1;q<p;++q ) {
+                twidx += fstride * k;
+                if (twidx>=Norig) twidx-=Norig;
+                C_MUL(t,scratchbuf[q] , twiddles[twidx] );
+                C_ADDTO( Fout[ k ] ,t);
+            }
+            k += m;
+        }
+    }
+}
+
+static
+void kf_work(
+        kiss_fft_cpx * Fout,
+        const kiss_fft_cpx * f,
+        const size_t fstride,
+        int in_stride,
+        int * factors,
+        const kiss_fft_cfg st
+        )
+{
+    kiss_fft_cpx * Fout_beg=Fout;
+    const int p=*factors++; /* the radix  */
+    const int m=*factors++; /* stage's fft length/p */
+    const kiss_fft_cpx * Fout_end = Fout + p*m;
+
+    if (m==1) {
+        do{
+            *Fout = *f;
+            f += fstride*in_stride;
+        }while(++Fout != Fout_end );
+    }else{
+        do{
+            kf_work( Fout , f, fstride*p, in_stride, factors,st);
+            f += fstride*in_stride;
+        }while( (Fout += m) != Fout_end );
+    }
+
+    Fout=Fout_beg;
+
+    switch (p) {
+        case 2: kf_bfly2(Fout,fstride,st,m); break;
+        case 3: kf_bfly3(Fout,fstride,st,m); break; 
+        case 4: kf_bfly4(Fout,fstride,st,m); break;
+        case 5: kf_bfly5(Fout,fstride,st,m); break; 
+        default: kf_bfly_generic(Fout,fstride,st,m,p); break;
+    }
+}
+
+/*  facbuf is populated by p1,m1,p2,m2, ...
+    where 
+    p[i] * m[i] = m[i-1]
+    m0 = n                  */
+static 
+void kf_factor(int n,int * facbuf)
+{
+    int p=4;
+    double floor_sqrt;
+    floor_sqrt = floor( sqrt((double)n) );
+
+    /*factor out powers of 4, powers of 2, then any remaining primes */
+    do {
+        while (n % p) {
+            switch (p) {
+                case 4: p = 2; break;
+                case 2: p = 3; break;
+                default: p += 2; break;
+            }
+            if (p > floor_sqrt)
+                p = n;          /* no more factors, skip to end */
+        }
+        n /= p;
+        *facbuf++ = p;
+        *facbuf++ = n;
+    } while (n > 1);
+}
+
+/*
+ *
+ * User-callable function to allocate all necessary storage space for the fft.
+ *
+ * The return value is a contiguous block of memory, allocated with malloc.  As such,
+ * it can be freed with free(), rather than a kiss_fft-specific function.
+ * */
+kiss_fft_cfg kiss_fft_alloc(int nfft,int inverse_fft,void * mem,size_t * lenmem )
+{
+    kiss_fft_cfg st=NULL;
+    size_t memneeded = sizeof(struct kiss_fft_state)
+        + sizeof(kiss_fft_cpx)*(nfft-1); /* twiddle factors*/
+
+    if ( lenmem==NULL ) {
+        st = ( kiss_fft_cfg)KISS_FFT_MALLOC( memneeded );
+    }else{
+        if (mem != NULL && *lenmem >= memneeded)
+            st = (kiss_fft_cfg)mem;
+        *lenmem = memneeded;
+    }
+    if (st) {
+        int i;
+        st->nfft=nfft;
+        st->inverse = inverse_fft;
+
+        for (i=0;i<nfft;++i) {
+            const double pi=3.141592653589793238462643383279502884197169399375105820974944;
+            double phase = -2*pi*i / nfft;
+            if (st->inverse)
+                phase *= -1;
+            kf_cexp(st->twiddles+i, phase );
+        }
+
+        kf_factor(nfft,st->factors);
+    }
+    return st;
+}
+
+
+
+    
+void kiss_fft_stride(kiss_fft_cfg st,const kiss_fft_cpx *fin,kiss_fft_cpx *fout,int in_stride)
+{
+    if (fin == fout) {
+        CHECKBUF(tmpbuf,ntmpbuf,st->nfft);
+        kf_work(tmpbuf,fin,1,in_stride, st->factors,st);
+        memcpy(fout,tmpbuf,sizeof(kiss_fft_cpx)*st->nfft);
+    }else{
+        kf_work( fout, fin, 1,in_stride, st->factors,st );
+    }
+}
+
+void kiss_fft(kiss_fft_cfg cfg,const kiss_fft_cpx *fin,kiss_fft_cpx *fout)
+{
+    kiss_fft_stride(cfg,fin,fout,1);
+}
+
+
+/* not really necessary to call, but if someone is doing in-place ffts, they may want to free the 
+   buffers from CHECKBUF
+ */ 
+void kiss_fft_cleanup(void)
+{
+    free(scratchbuf);
+    scratchbuf = NULL;
+    nscratchbuf=0;
+    free(tmpbuf);
+    tmpbuf=NULL;
+    ntmpbuf=0;
+}
+
+int kiss_fft_next_fast_size(int n)
+{
+    while(1) {
+        int m=n;
+        while ( (m%2) == 0 ) m/=2;
+        while ( (m%3) == 0 ) m/=3;
+        while ( (m%5) == 0 ) m/=5;
+        if (m<=1)
+            break; /* n is completely factorable by twos, threes, and fives */
+        n++;
+    }
+    return n;
+}
diff --git a/src/kissfft/kiss_fft.h b/src/kissfft/kiss_fft.h
new file mode 100644
index 0000000..f8e523e
--- /dev/null
+++ b/src/kissfft/kiss_fft.h
@@ -0,0 +1,121 @@
+#ifndef KISS_FFT_H
+#define KISS_FFT_H
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <math.h>
+#include <memory.h>
+#ifndef __APPLE__
+#include <malloc.h>
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+ ATTENTION!
+ If you would like:
+ -- a utility that will handle the caching of fft objects
+ -- a real-only (no imaginary time component) FFT
+ -- a multi-dimensional FFT
+ -- a command-line utility to perform ffts
+ -- a command-line utility to perform fast-convolution filtering
+
+ Then see kfc.h kiss_fftr.h kiss_fftnd.h fftutil.c kiss_fastfir.c
+  in the tools/ directory.
+*/
+
+#ifdef USE_SIMD
+# include <xmmintrin.h>
+# define kiss_fft_scalar __m128
+#define KISS_FFT_MALLOC(nbytes) memalign(16,nbytes)
+#else	
+#define KISS_FFT_MALLOC malloc
+#endif	
+
+
+#ifdef FIXED_POINT
+#include <sys/types.h>	
+# if (FIXED_POINT == 32)
+#  define kiss_fft_scalar int32_t
+# else	
+#  define kiss_fft_scalar int16_t
+# endif
+#else
+# ifndef kiss_fft_scalar
+/*  default is float */
+#   define kiss_fft_scalar float
+# endif
+#endif
+
+typedef struct {
+    kiss_fft_scalar r;
+    kiss_fft_scalar i;
+}kiss_fft_cpx;
+
+typedef struct kiss_fft_state* kiss_fft_cfg;
+
+/* 
+ *  kiss_fft_alloc
+ *  
+ *  Initialize an FFT (or IFFT) algorithm's cfg/state buffer.
+ *
+ *  typical usage:      kiss_fft_cfg mycfg=kiss_fft_alloc(1024,0,NULL,NULL);
+ *
+ *  The return value from kiss_fft_alloc is a cfg buffer used internally
+ *  by the fft routine, or NULL.
+ *
+ *  If lenmem is NULL, then kiss_fft_alloc will allocate a cfg buffer using malloc.
+ *  The returned value should be free()d when done to avoid memory leaks.
+ *  
+ *  The state can be placed in a user supplied buffer 'mem':
+ *  If lenmem is not NULL and mem is not NULL and *lenmem is large enough,
+ *      then the function places the cfg in mem and the size used in *lenmem
+ *      and returns mem.
+ *  
+ *  If lenmem is not NULL and ( mem is NULL or *lenmem is not large enough),
+ *      then the function returns NULL and places the minimum cfg 
+ *      buffer size in *lenmem.
+ * */
+
+kiss_fft_cfg kiss_fft_alloc(int nfft,int inverse_fft,void * mem,size_t * lenmem); 
+
+/*
+ * kiss_fft(cfg,fin,fout)
+ *
+ * Perform an FFT on a complex input buffer.
+ * for a forward FFT,
+ * fin should be  f[0] , f[1] , ... ,f[nfft-1]
+ * fout will be   F[0] , F[1] , ... ,F[nfft-1]
+ * Note that each element is complex and can be accessed like
+    f[k].r and f[k].i
+ * */
+void kiss_fft(kiss_fft_cfg cfg,const kiss_fft_cpx *fin,kiss_fft_cpx *fout);
+
+/*
+ A more generic version of the above function. It reads its input from every Nth sample.
+ * */
+void kiss_fft_stride(kiss_fft_cfg cfg,const kiss_fft_cpx *fin,kiss_fft_cpx *fout,int fin_stride);
+
+/* If kiss_fft_alloc allocated a buffer, it is one contiguous 
+   buffer and can be simply free()d when no longer needed*/
+#define kiss_fft_free free
+
+/*
+ Cleans up some memory that gets managed internally. Not necessary to call, but it might clean up 
+ your compiler output to call this before you exit.
+*/
+void kiss_fft_cleanup(void);
+	
+
+/*
+ * Returns the smallest integer k, such that k>=n and k has only "fast" factors (2,3,5)
+ */
+int kiss_fft_next_fast_size(int n);
+
+#ifdef __cplusplus
+} 
+#endif
+
+#endif
diff --git a/src/kissfft/kiss_fftr.c b/src/kissfft/kiss_fftr.c
new file mode 100644
index 0000000..5bc669d
--- /dev/null
+++ b/src/kissfft/kiss_fftr.c
@@ -0,0 +1,159 @@
+/*
+Copyright (c) 2003-2004, Mark Borgerding
+
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
+    * Neither the author nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#include "kiss_fftr.h"
+#include "_kiss_fft_guts.h"
+
+struct kiss_fftr_state{
+    kiss_fft_cfg substate;
+    kiss_fft_cpx * tmpbuf;
+    kiss_fft_cpx * super_twiddles;
+#ifdef USE_SIMD    
+    long pad;
+#endif    
+};
+
+kiss_fftr_cfg kiss_fftr_alloc(int nfft,int inverse_fft,void * mem,size_t * lenmem)
+{
+    int i;
+    kiss_fftr_cfg st = NULL;
+    size_t subsize, memneeded;
+
+    if (nfft & 1) {
+        fprintf(stderr,"Real FFT optimization must be even.\n");
+        return NULL;
+    }
+    nfft >>= 1;
+
+    kiss_fft_alloc (nfft, inverse_fft, NULL, &subsize);
+    memneeded = sizeof(struct kiss_fftr_state) + subsize + sizeof(kiss_fft_cpx) * ( nfft * 2);
+
+    if (lenmem == NULL) {
+        st = (kiss_fftr_cfg) KISS_FFT_MALLOC (memneeded);
+    } else {
+        if (*lenmem >= memneeded)
+            st = (kiss_fftr_cfg) mem;
+        *lenmem = memneeded;
+    }
+    if (!st)
+        return NULL;
+
+    st->substate = (kiss_fft_cfg) (st + 1); /*just beyond kiss_fftr_state struct */
+    st->tmpbuf = (kiss_fft_cpx *) (((char *) st->substate) + subsize);
+    st->super_twiddles = st->tmpbuf + nfft;
+    kiss_fft_alloc(nfft, inverse_fft, st->substate, &subsize);
+
+    for (i = 0; i < nfft; ++i) {
+        double phase =
+            -3.14159265358979323846264338327 * ((double) i / nfft + .5);
+        if (inverse_fft)
+            phase *= -1;
+        kf_cexp (st->super_twiddles+i,phase);
+    }
+    return st;
+}
+
+void kiss_fftr(kiss_fftr_cfg st,const kiss_fft_scalar *timedata,kiss_fft_cpx *freqdata)
+{
+    /* input buffer timedata is stored row-wise */
+    int k,ncfft;
+    kiss_fft_cpx fpnk,fpk,f1k,f2k,tw,tdc;
+
+    if ( st->substate->inverse) {
+        fprintf(stderr,"kiss fft usage error: improper alloc\n");
+        exit(1);
+    }
+
+    ncfft = st->substate->nfft;
+
+    /*perform the parallel fft of two real signals packed in real,imag*/
+    kiss_fft( st->substate , (const kiss_fft_cpx*)timedata, st->tmpbuf );
+    /* The real part of the DC element of the frequency spectrum in st->tmpbuf
+     * contains the sum of the even-numbered elements of the input time sequence
+     * The imag part is the sum of the odd-numbered elements
+     *
+     * The sum of tdc.r and tdc.i is the sum of the input time sequence,
+     *      yielding DC of input time sequence
+     * The difference tdc.r - tdc.i is the dot product of the input with [1,-1,1,-1,...],
+     *      yielding Nyquist bin of input time sequence
+     */
+ 
+    tdc.r = st->tmpbuf[0].r;
+    tdc.i = st->tmpbuf[0].i;
+    C_FIXDIV(tdc,2);
+    CHECK_OVERFLOW_OP(tdc.r ,+, tdc.i);
+    CHECK_OVERFLOW_OP(tdc.r ,-, tdc.i);
+    freqdata[0].r = tdc.r + tdc.i;
+    freqdata[ncfft].r = tdc.r - tdc.i;
+#ifdef USE_SIMD    
+    freqdata[ncfft].i = freqdata[0].i = _mm_set1_ps(0);
+#else
+    freqdata[ncfft].i = freqdata[0].i = 0;
+#endif
+
+    for ( k=1;k <= ncfft/2 ; ++k ) {
+        fpk    = st->tmpbuf[k]; 
+        fpnk.r =   st->tmpbuf[ncfft-k].r;
+        fpnk.i = - st->tmpbuf[ncfft-k].i;
+        C_FIXDIV(fpk,2);
+        C_FIXDIV(fpnk,2);
+
+        C_ADD( f1k, fpk , fpnk );
+        C_SUB( f2k, fpk , fpnk );
+        C_MUL( tw , f2k , st->super_twiddles[k]);
+
+        freqdata[k].r = HALF_OF(f1k.r + tw.r);
+        freqdata[k].i = HALF_OF(f1k.i + tw.i);
+        freqdata[ncfft-k].r = HALF_OF(f1k.r - tw.r);
+        freqdata[ncfft-k].i = HALF_OF(tw.i - f1k.i);
+    }
+}
+
+void kiss_fftri(kiss_fftr_cfg st,const kiss_fft_cpx *freqdata,kiss_fft_scalar *timedata)
+{
+    /* input buffer timedata is stored row-wise */
+    int k, ncfft;
+
+    if (st->substate->inverse == 0) {
+        fprintf (stderr, "kiss fft usage error: improper alloc\n");
+        exit (1);
+    }
+
+    ncfft = st->substate->nfft;
+
+    st->tmpbuf[0].r = freqdata[0].r + freqdata[ncfft].r;
+    st->tmpbuf[0].i = freqdata[0].r - freqdata[ncfft].r;
+    C_FIXDIV(st->tmpbuf[0],2);
+
+    for (k = 1; k <= ncfft / 2; ++k) {
+        kiss_fft_cpx fk, fnkc, fek, fok, tmp;
+        fk = freqdata[k];
+        fnkc.r = freqdata[ncfft - k].r;
+        fnkc.i = -freqdata[ncfft - k].i;
+        C_FIXDIV( fk , 2 );
+        C_FIXDIV( fnkc , 2 );
+
+        C_ADD (fek, fk, fnkc);
+        C_SUB (tmp, fk, fnkc);
+        C_MUL (fok, tmp, st->super_twiddles[k]);
+        C_ADD (st->tmpbuf[k],     fek, fok);
+        C_SUB (st->tmpbuf[ncfft - k], fek, fok);
+#ifdef USE_SIMD        
+        st->tmpbuf[ncfft - k].i *= _mm_set1_ps(-1.0);
+#else
+        st->tmpbuf[ncfft - k].i *= -1;
+#endif
+    }
+    kiss_fft (st->substate, st->tmpbuf, (kiss_fft_cpx *) timedata);
+}
diff --git a/src/kissfft/kiss_fftr.h b/src/kissfft/kiss_fftr.h
new file mode 100644
index 0000000..72e5a57
--- /dev/null
+++ b/src/kissfft/kiss_fftr.h
@@ -0,0 +1,46 @@
+#ifndef KISS_FTR_H
+#define KISS_FTR_H
+
+#include "kiss_fft.h"
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+    
+/* 
+ 
+ Real optimized version can save about 45% cpu time vs. complex fft of a real seq.
+
+ 
+ 
+ */
+
+typedef struct kiss_fftr_state *kiss_fftr_cfg;
+
+
+kiss_fftr_cfg kiss_fftr_alloc(int nfft,int inverse_fft,void * mem, size_t * lenmem);
+/*
+ nfft must be even
+
+ If you don't care to allocate space, use mem = lenmem = NULL 
+*/
+
+
+void kiss_fftr(kiss_fftr_cfg cfg,const kiss_fft_scalar *timedata,kiss_fft_cpx *freqdata);
+/*
+ input timedata has nfft scalar points
+ output freqdata has nfft/2+1 complex points
+*/
+
+void kiss_fftri(kiss_fftr_cfg cfg,const kiss_fft_cpx *freqdata,kiss_fft_scalar *timedata);
+/*
+ input freqdata has  nfft/2+1 complex points
+ output timedata has nfft scalar points
+*/
+
+#define kiss_fftr_free free
+
+#ifdef __cplusplus
+}
+#endif
+#endif
diff --git a/src/pommier/neon_mathfun.h b/src/pommier/neon_mathfun.h
new file mode 100644
index 0000000..4a77283
--- /dev/null
+++ b/src/pommier/neon_mathfun.h
@@ -0,0 +1,301 @@
+/* NEON implementation of sin, cos, exp and log
+
+   Inspired by Intel Approximate Math library, and based on the
+   corresponding algorithms of the cephes math library
+*/
+
+/* Copyright (C) 2011  Julien Pommier
+
+  This software is provided 'as-is', without any express or implied
+  warranty.  In no event will the authors be held liable for any damages
+  arising from the use of this software.
+
+  Permission is granted to anyone to use this software for any purpose,
+  including commercial applications, and to alter it and redistribute it
+  freely, subject to the following restrictions:
+
+  1. The origin of this software must not be misrepresented; you must not
+     claim that you wrote the original software. If you use this software
+     in a product, an acknowledgment in the product documentation would be
+     appreciated but is not required.
+  2. Altered source versions must be plainly marked as such, and must not be
+     misrepresented as being the original software.
+  3. This notice may not be removed or altered from any source distribution.
+
+  (this is the zlib license)
+*/
+
+#include <arm_neon.h>
+
+typedef float32x4_t v4sf;  // vector of 4 float
+typedef uint32x4_t v4su;  // vector of 4 uint32
+typedef int32x4_t v4si;  // vector of 4 int32
+
+#define c_inv_mant_mask ~0x7f800000u
+#define c_cephes_SQRTHF 0.707106781186547524
+#define c_cephes_log_p0 7.0376836292E-2
+#define c_cephes_log_p1 - 1.1514610310E-1
+#define c_cephes_log_p2 1.1676998740E-1
+#define c_cephes_log_p3 - 1.2420140846E-1
+#define c_cephes_log_p4 + 1.4249322787E-1
+#define c_cephes_log_p5 - 1.6668057665E-1
+#define c_cephes_log_p6 + 2.0000714765E-1
+#define c_cephes_log_p7 - 2.4999993993E-1
+#define c_cephes_log_p8 + 3.3333331174E-1
+#define c_cephes_log_q1 -2.12194440e-4
+#define c_cephes_log_q2 0.693359375
+
+/* natural logarithm computed for 4 simultaneous float 
+   return NaN for x <= 0
+*/
+v4sf log_ps(v4sf x) {
+  v4sf one = vdupq_n_f32(1);
+
+  x = vmaxq_f32(x, vdupq_n_f32(0)); /* force flush to zero on denormal values */
+  v4su invalid_mask = vcleq_f32(x, vdupq_n_f32(0));
+
+  v4si ux = vreinterpretq_s32_f32(x);
+  
+  v4si emm0 = vshrq_n_s32(ux, 23);
+
+  /* keep only the fractional part */
+  ux = vandq_s32(ux, vdupq_n_s32(c_inv_mant_mask));
+  ux = vorrq_s32(ux, vreinterpretq_s32_f32(vdupq_n_f32(0.5f)));
+  x = vreinterpretq_f32_s32(ux);
+
+  emm0 = vsubq_s32(emm0, vdupq_n_s32(0x7f));
+  v4sf e = vcvtq_f32_s32(emm0);
+
+  e = vaddq_f32(e, one);
+
+  /* part2: 
+     if( x < SQRTHF ) {
+       e -= 1;
+       x = x + x - 1.0;
+     } else { x = x - 1.0; }
+  */
+  v4su mask = vcltq_f32(x, vdupq_n_f32(c_cephes_SQRTHF));
+  v4sf tmp = vreinterpretq_f32_u32(vandq_u32(vreinterpretq_u32_f32(x), mask));
+  x = vsubq_f32(x, one);
+  e = vsubq_f32(e, vreinterpretq_f32_u32(vandq_u32(vreinterpretq_u32_f32(one), mask)));
+  x = vaddq_f32(x, tmp);
+
+  v4sf z = vmulq_f32(x,x);
+
+  v4sf y = vdupq_n_f32(c_cephes_log_p0);
+  y = vmulq_f32(y, x);
+  y = vaddq_f32(y, vdupq_n_f32(c_cephes_log_p1));
+  y = vmulq_f32(y, x);
+  y = vaddq_f32(y, vdupq_n_f32(c_cephes_log_p2));
+  y = vmulq_f32(y, x);
+  y = vaddq_f32(y, vdupq_n_f32(c_cephes_log_p3));
+  y = vmulq_f32(y, x);
+  y = vaddq_f32(y, vdupq_n_f32(c_cephes_log_p4));
+  y = vmulq_f32(y, x);
+  y = vaddq_f32(y, vdupq_n_f32(c_cephes_log_p5));
+  y = vmulq_f32(y, x);
+  y = vaddq_f32(y, vdupq_n_f32(c_cephes_log_p6));
+  y = vmulq_f32(y, x);
+  y = vaddq_f32(y, vdupq_n_f32(c_cephes_log_p7));
+  y = vmulq_f32(y, x);
+  y = vaddq_f32(y, vdupq_n_f32(c_cephes_log_p8));
+  y = vmulq_f32(y, x);
+
+  y = vmulq_f32(y, z);
+  
+
+  tmp = vmulq_f32(e, vdupq_n_f32(c_cephes_log_q1));
+  y = vaddq_f32(y, tmp);
+
+
+  tmp = vmulq_f32(z, vdupq_n_f32(0.5f));
+  y = vsubq_f32(y, tmp);
+
+  tmp = vmulq_f32(e, vdupq_n_f32(c_cephes_log_q2));
+  x = vaddq_f32(x, y);
+  x = vaddq_f32(x, tmp);
+  x = vreinterpretq_f32_u32(vorrq_u32(vreinterpretq_u32_f32(x), invalid_mask)); // negative arg will be NAN
+  return x;
+}
+
+#define c_exp_hi 88.3762626647949f
+#define c_exp_lo -88.3762626647949f
+
+#define c_cephes_LOG2EF 1.44269504088896341
+#define c_cephes_exp_C1 0.693359375
+#define c_cephes_exp_C2 -2.12194440e-4
+
+#define c_cephes_exp_p0 1.9875691500E-4
+#define c_cephes_exp_p1 1.3981999507E-3
+#define c_cephes_exp_p2 8.3334519073E-3
+#define c_cephes_exp_p3 4.1665795894E-2
+#define c_cephes_exp_p4 1.6666665459E-1
+#define c_cephes_exp_p5 5.0000001201E-1
+
+/* exp() computed for 4 float at once */
+v4sf exp_ps(v4sf x) {
+  v4sf tmp, fx;
+
+  v4sf one = vdupq_n_f32(1);
+  x = vminq_f32(x, vdupq_n_f32(c_exp_hi));
+  x = vmaxq_f32(x, vdupq_n_f32(c_exp_lo));
+
+  /* express exp(x) as exp(g + n*log(2)) */
+  fx = vmlaq_f32(vdupq_n_f32(0.5f), x, vdupq_n_f32(c_cephes_LOG2EF));
+
+  /* perform a floorf */
+  tmp = vcvtq_f32_s32(vcvtq_s32_f32(fx));
+
+  /* if greater, subtract 1 */
+  v4su mask = vcgtq_f32(tmp, fx);    
+  mask = vandq_u32(mask, vreinterpretq_u32_f32(one));
+
+
+  fx = vsubq_f32(tmp, vreinterpretq_f32_u32(mask));
+
+  tmp = vmulq_f32(fx, vdupq_n_f32(c_cephes_exp_C1));
+  v4sf z = vmulq_f32(fx, vdupq_n_f32(c_cephes_exp_C2));
+  x = vsubq_f32(x, tmp);
+  x = vsubq_f32(x, z);
+
+  static const float32_t cephes_exp_p[6] = { c_cephes_exp_p0, c_cephes_exp_p1, c_cephes_exp_p2, c_cephes_exp_p3, c_cephes_exp_p4, c_cephes_exp_p5 };
+  v4sf y = vld1q_dup_f32(cephes_exp_p+0);
+  v4sf c1 = vld1q_dup_f32(cephes_exp_p+1); 
+  v4sf c2 = vld1q_dup_f32(cephes_exp_p+2); 
+  v4sf c3 = vld1q_dup_f32(cephes_exp_p+3); 
+  v4sf c4 = vld1q_dup_f32(cephes_exp_p+4); 
+  v4sf c5 = vld1q_dup_f32(cephes_exp_p+5);
+
+  y = vmulq_f32(y, x);
+  z = vmulq_f32(x,x);
+  y = vaddq_f32(y, c1);
+  y = vmulq_f32(y, x);
+  y = vaddq_f32(y, c2);
+  y = vmulq_f32(y, x);
+  y = vaddq_f32(y, c3);
+  y = vmulq_f32(y, x);
+  y = vaddq_f32(y, c4);
+  y = vmulq_f32(y, x);
+  y = vaddq_f32(y, c5);
+  
+  y = vmulq_f32(y, z);
+  y = vaddq_f32(y, x);
+  y = vaddq_f32(y, one);
+
+  /* build 2^n */
+  int32x4_t mm;
+  mm = vcvtq_s32_f32(fx);
+  mm = vaddq_s32(mm, vdupq_n_s32(0x7f));
+  mm = vshlq_n_s32(mm, 23);
+  v4sf pow2n = vreinterpretq_f32_s32(mm);
+
+  y = vmulq_f32(y, pow2n);
+  return y;
+}
+
+#define c_minus_cephes_DP1 -0.78515625
+#define c_minus_cephes_DP2 -2.4187564849853515625e-4
+#define c_minus_cephes_DP3 -3.77489497744594108e-8
+#define c_sincof_p0 -1.9515295891E-4
+#define c_sincof_p1  8.3321608736E-3
+#define c_sincof_p2 -1.6666654611E-1
+#define c_coscof_p0  2.443315711809948E-005
+#define c_coscof_p1 -1.388731625493765E-003
+#define c_coscof_p2  4.166664568298827E-002
+#define c_cephes_FOPI 1.27323954473516 // 4 / M_PI
+
+/* evaluation of 4 sines & cosines at once.
+
+   The code is the exact rewriting of the cephes sinf function.
+   Precision is excellent as long as x < 8192 (I did not bother to
+   take into account the special handling they have for greater values
+   -- it does not return garbage for arguments over 8192, though, but
+   the extra precision is missing).
+
+   Note that it is such that sinf((float)M_PI) = 8.74e-8, which is the
+   surprising but correct result.
+
+   Note also that when you compute sin(x), cos(x) is available at
+   almost no extra price so both sin_ps and cos_ps make use of
+   sincos_ps.
+  */
+void sincos_ps(v4sf x, v4sf *ysin, v4sf *ycos) { // any x
+  v4sf xmm1, xmm2, xmm3, y;
+
+  v4su emm2;
+  
+  v4su sign_mask_sin, sign_mask_cos;
+  sign_mask_sin = vcltq_f32(x, vdupq_n_f32(0));
+  x = vabsq_f32(x);
+
+  /* scale by 4/Pi */
+  y = vmulq_f32(x, vdupq_n_f32(c_cephes_FOPI));
+
+  /* store the integer part of y in emm2 */
+  emm2 = vcvtq_u32_f32(y);
+  /* j=(j+1) & (~1) (see the cephes sources) */
+  emm2 = vaddq_u32(emm2, vdupq_n_u32(1));
+  emm2 = vandq_u32(emm2, vdupq_n_u32(~1));
+  y = vcvtq_f32_u32(emm2);
+
+  /* get the polynomial selection mask:
+     there is one polynomial for 0 <= x <= Pi/4
+     and another one for Pi/4<x<=Pi/2
+
+     Both branches will be computed.
+  */
+  v4su poly_mask = vtstq_u32(emm2, vdupq_n_u32(2));
+  
+  /* The magic pass: "Extended precision modular arithmetic" 
+     x = ((x - y * DP1) - y * DP2) - y * DP3; */
+  xmm1 = vmulq_n_f32(y, c_minus_cephes_DP1);
+  xmm2 = vmulq_n_f32(y, c_minus_cephes_DP2);
+  xmm3 = vmulq_n_f32(y, c_minus_cephes_DP3);
+  x = vaddq_f32(x, xmm1);
+  x = vaddq_f32(x, xmm2);
+  x = vaddq_f32(x, xmm3);
+
+  sign_mask_sin = veorq_u32(sign_mask_sin, vtstq_u32(emm2, vdupq_n_u32(4)));
+  sign_mask_cos = vtstq_u32(vsubq_u32(emm2, vdupq_n_u32(2)), vdupq_n_u32(4));
+
+  /* Evaluate the cosine polynomial in y1 and the sine polynomial in y2;
+     after the range reduction above, x lies in [-Pi/4, Pi/4] */
+  v4sf z = vmulq_f32(x,x);
+  v4sf y1, y2;
+
+  y1 = vmulq_n_f32(z, c_coscof_p0);
+  y2 = vmulq_n_f32(z, c_sincof_p0);
+  y1 = vaddq_f32(y1, vdupq_n_f32(c_coscof_p1));
+  y2 = vaddq_f32(y2, vdupq_n_f32(c_sincof_p1));
+  y1 = vmulq_f32(y1, z);
+  y2 = vmulq_f32(y2, z);
+  y1 = vaddq_f32(y1, vdupq_n_f32(c_coscof_p2));
+  y2 = vaddq_f32(y2, vdupq_n_f32(c_sincof_p2));
+  y1 = vmulq_f32(y1, z);
+  y2 = vmulq_f32(y2, z);
+  y1 = vmulq_f32(y1, z);
+  y2 = vmulq_f32(y2, x);
+  y1 = vsubq_f32(y1, vmulq_f32(z, vdupq_n_f32(0.5f)));
+  y2 = vaddq_f32(y2, x);
+  y1 = vaddq_f32(y1, vdupq_n_f32(1));
+
+  /* select the correct result from the two polynomials */
+  v4sf ys = vbslq_f32(poly_mask, y1, y2);
+  v4sf yc = vbslq_f32(poly_mask, y2, y1);
+  *ysin = vbslq_f32(sign_mask_sin, vnegq_f32(ys), ys);
+  *ycos = vbslq_f32(sign_mask_cos, yc, vnegq_f32(yc));
+}
+
+v4sf sin_ps(v4sf x) {
+  v4sf ysin, ycos; 
+  sincos_ps(x, &ysin, &ycos); 
+  return ysin;
+}
+
+v4sf cos_ps(v4sf x) {
+  v4sf ysin, ycos; 
+  sincos_ps(x, &ysin, &ycos); 
+  return ycos;
+}
+
+
diff --git a/src/pommier/sse_mathfun.h b/src/pommier/sse_mathfun.h
new file mode 100644
index 0000000..2a2baac
--- /dev/null
+++ b/src/pommier/sse_mathfun.h
@@ -0,0 +1,766 @@
+
+#ifndef _POMMIER_SSE_MATHFUN_H_
+#define _POMMIER_SSE_MATHFUN_H_
+
+/* SIMD (SSE1+MMX or SSE2) implementation of sin, cos, exp and log
+
+   Inspired by Intel Approximate Math library, and based on the
+   corresponding algorithms of the cephes math library
+
+   The default is to use the SSE1 version. If you define USE_SSE2, the
+   SSE2 intrinsics will be used in place of the MMX intrinsics. Do
+   not expect any significant performance improvement with SSE2.
+*/
+
+/* Copyright (C) 2007  Julien Pommier
+
+  This software is provided 'as-is', without any express or implied
+  warranty.  In no event will the authors be held liable for any damages
+  arising from the use of this software.
+
+  Permission is granted to anyone to use this software for any purpose,
+  including commercial applications, and to alter it and redistribute it
+  freely, subject to the following restrictions:
+
+  1. The origin of this software must not be misrepresented; you must not
+     claim that you wrote the original software. If you use this software
+     in a product, an acknowledgment in the product documentation would be
+     appreciated but is not required.
+  2. Altered source versions must be plainly marked as such, and must not be
+     misrepresented as being the original software.
+  3. This notice may not be removed or altered from any source distribution.
+
+  (this is the zlib license)
+*/
+
+#include <xmmintrin.h>
+
+/* yes I know, the top of this file is quite ugly */
+
+#ifdef _MSC_VER /* visual c++ */
+# define ALIGN16_BEG __declspec(align(16))
+# define ALIGN16_END 
+#else /* gcc or icc */
+# define ALIGN16_BEG
+# define ALIGN16_END __attribute__((aligned(16)))
+#endif
+
+/* __m128 is ugly to write */
+typedef __m128 v4sf;  // vector of 4 float (sse1)
+
+#ifdef USE_SSE2
+# include <emmintrin.h>
+typedef __m128i v4si; // vector of 4 int (sse2)
+#else
+typedef __m64 v2si;   // vector of 2 int (mmx)
+#endif
+
+/* declare some SSE constants -- why can't I figure a better way to do that? */
+#define _PS_CONST(Name, Val)                                            \
+  static const ALIGN16_BEG float _ps_##Name[4] ALIGN16_END = { Val, Val, Val, Val }
+#define _PI32_CONST(Name, Val)                                            \
+  static const ALIGN16_BEG int _pi32_##Name[4] ALIGN16_END = { Val, Val, Val, Val }
+#define _PS_CONST_TYPE(Name, Type, Val)                                 \
+  static const ALIGN16_BEG Type _ps_##Name[4] ALIGN16_END = { Val, Val, Val, Val }
+
+_PS_CONST(1  , 1.0f);
+_PS_CONST(0p5, 0.5f);
+/* the smallest non denormalized float number */
+_PS_CONST_TYPE(min_norm_pos, int, 0x00800000);
+_PS_CONST_TYPE(mant_mask, int, 0x7f800000);
+_PS_CONST_TYPE(inv_mant_mask, int, ~0x7f800000);
+
+_PS_CONST_TYPE(sign_mask, int, 0x80000000);
+_PS_CONST_TYPE(inv_sign_mask, int, ~0x80000000);
+
+_PI32_CONST(1, 1);
+_PI32_CONST(inv1, ~1);
+_PI32_CONST(2, 2);
+_PI32_CONST(4, 4);
+_PI32_CONST(0x7f, 0x7f);
+
+_PS_CONST(cephes_SQRTHF, 0.707106781186547524);
+_PS_CONST(cephes_log_p0, 7.0376836292E-2);
+_PS_CONST(cephes_log_p1, - 1.1514610310E-1);
+_PS_CONST(cephes_log_p2, 1.1676998740E-1);
+_PS_CONST(cephes_log_p3, - 1.2420140846E-1);
+_PS_CONST(cephes_log_p4, + 1.4249322787E-1);
+_PS_CONST(cephes_log_p5, - 1.6668057665E-1);
+_PS_CONST(cephes_log_p6, + 2.0000714765E-1);
+_PS_CONST(cephes_log_p7, - 2.4999993993E-1);
+_PS_CONST(cephes_log_p8, + 3.3333331174E-1);
+_PS_CONST(cephes_log_q1, -2.12194440e-4);
+_PS_CONST(cephes_log_q2, 0.693359375);
+
+#if defined (__MINGW32__)
+
+/* the ugly part below: many versions of gcc used to be completely buggy with respect to some intrinsics
+   The movehl_ps is fixed in mingw 3.4.5, but I found out that all the _mm_cmp* intrinsics were completely
+   broken on my mingw gcc 3.4.5 ...
+
+   Note that the bug on _mm_cmp* occurs only at the -O0 optimization level
+*/
+
+inline __m128 my_movehl_ps(__m128 a, const __m128 b) {
+	asm (
+			"movhlps %2,%0\n\t"
+			: "=x" (a)
+			: "0" (a), "x"(b)
+	    );
+	return a;                                 }
+#warning "redefined _mm_movehl_ps (see gcc bug 21179)"
+#define _mm_movehl_ps my_movehl_ps
+
+inline __m128 my_cmplt_ps(__m128 a, const __m128 b) {
+	asm (
+			"cmpltps %2,%0\n\t"
+			: "=x" (a)
+			: "0" (a), "x"(b)
+	    );
+	return a;
+}
+inline __m128 my_cmpgt_ps(__m128 a, const __m128 b) {
+	asm (
+			"cmpnleps %2,%0\n\t"
+			: "=x" (a)
+			: "0" (a), "x"(b)
+	    );
+	return a;               
+}
+inline __m128 my_cmpeq_ps(__m128 a, const __m128 b) {
+	asm (
+			"cmpeqps %2,%0\n\t"
+			: "=x" (a)
+			: "0" (a), "x"(b)
+	    );
+	return a;               
+}
+#warning "redefined _mm_cmpxx_ps functions..."
+#define _mm_cmplt_ps my_cmplt_ps
+#define _mm_cmpgt_ps my_cmpgt_ps
+#define _mm_cmpeq_ps my_cmpeq_ps
+#endif
+
+#ifndef USE_SSE2
+typedef union xmm_mm_union {
+  __m128 xmm;
+  __m64 mm[2];
+} xmm_mm_union;
+
+#define COPY_XMM_TO_MM(xmm_, mm0_, mm1_) {          \
+    xmm_mm_union u; u.xmm = xmm_;                   \
+    mm0_ = u.mm[0];                                 \
+    mm1_ = u.mm[1];                                 \
+}
+
+#define COPY_MM_TO_XMM(mm0_, mm1_, xmm_) {                         \
+    xmm_mm_union u; u.mm[0]=mm0_; u.mm[1]=mm1_; xmm_ = u.xmm;      \
+  }
+
+#endif // USE_SSE2
+
+/* natural logarithm computed for 4 simultaneous float 
+   return NaN for x <= 0
+*/
+v4sf log_ps(v4sf x) {
+#ifdef USE_SSE2
+  v4si emm0;
+#else
+  v2si mm0, mm1;
+#endif
+  v4sf one = *(v4sf*)_ps_1;
+
+  v4sf invalid_mask = _mm_cmple_ps(x, _mm_setzero_ps());
+
+  x = _mm_max_ps(x, *(v4sf*)_ps_min_norm_pos);  /* cut off denormalized stuff */
+
+#ifndef USE_SSE2
+  /* part 1: x = frexpf(x, &e); */
+  COPY_XMM_TO_MM(x, mm0, mm1);
+  mm0 = _mm_srli_pi32(mm0, 23);
+  mm1 = _mm_srli_pi32(mm1, 23);
+#else
+  emm0 = _mm_srli_epi32(_mm_castps_si128(x), 23);
+#endif
+  /* keep only the fractional part */
+  x = _mm_and_ps(x, *(v4sf*)_ps_inv_mant_mask);
+  x = _mm_or_ps(x, *(v4sf*)_ps_0p5);
+
+#ifndef USE_SSE2
+  /* now e=mm0:mm1 contains the real base-2 exponent */
+  mm0 = _mm_sub_pi32(mm0, *(v2si*)_pi32_0x7f);
+  mm1 = _mm_sub_pi32(mm1, *(v2si*)_pi32_0x7f);
+  v4sf e = _mm_cvtpi32x2_ps(mm0, mm1);
+  _mm_empty(); /* bye bye mmx */
+#else
+  emm0 = _mm_sub_epi32(emm0, *(v4si*)_pi32_0x7f);
+  v4sf e = _mm_cvtepi32_ps(emm0);
+#endif
+
+  e = _mm_add_ps(e, one);
+
+  /* part2: 
+     if( x < SQRTHF ) {
+       e -= 1;
+       x = x + x - 1.0;
+     } else { x = x - 1.0; }
+  */
+  v4sf mask = _mm_cmplt_ps(x, *(v4sf*)_ps_cephes_SQRTHF);
+  v4sf tmp = _mm_and_ps(x, mask);
+  x = _mm_sub_ps(x, one);
+  e = _mm_sub_ps(e, _mm_and_ps(one, mask));
+  x = _mm_add_ps(x, tmp);
+
+
+  v4sf z = _mm_mul_ps(x,x);
+
+  v4sf y = *(v4sf*)_ps_cephes_log_p0;
+  y = _mm_mul_ps(y, x);
+  y = _mm_add_ps(y, *(v4sf*)_ps_cephes_log_p1);
+  y = _mm_mul_ps(y, x);
+  y = _mm_add_ps(y, *(v4sf*)_ps_cephes_log_p2);
+  y = _mm_mul_ps(y, x);
+  y = _mm_add_ps(y, *(v4sf*)_ps_cephes_log_p3);
+  y = _mm_mul_ps(y, x);
+  y = _mm_add_ps(y, *(v4sf*)_ps_cephes_log_p4);
+  y = _mm_mul_ps(y, x);
+  y = _mm_add_ps(y, *(v4sf*)_ps_cephes_log_p5);
+  y = _mm_mul_ps(y, x);
+  y = _mm_add_ps(y, *(v4sf*)_ps_cephes_log_p6);
+  y = _mm_mul_ps(y, x);
+  y = _mm_add_ps(y, *(v4sf*)_ps_cephes_log_p7);
+  y = _mm_mul_ps(y, x);
+  y = _mm_add_ps(y, *(v4sf*)_ps_cephes_log_p8);
+  y = _mm_mul_ps(y, x);
+
+  y = _mm_mul_ps(y, z);
+  
+
+  tmp = _mm_mul_ps(e, *(v4sf*)_ps_cephes_log_q1);
+  y = _mm_add_ps(y, tmp);
+
+
+  tmp = _mm_mul_ps(z, *(v4sf*)_ps_0p5);
+  y = _mm_sub_ps(y, tmp);
+
+  tmp = _mm_mul_ps(e, *(v4sf*)_ps_cephes_log_q2);
+  x = _mm_add_ps(x, y);
+  x = _mm_add_ps(x, tmp);
+  x = _mm_or_ps(x, invalid_mask); // negative arg will be NAN
+  return x;
+}
+
+_PS_CONST(exp_hi,	88.3762626647949f);
+_PS_CONST(exp_lo,	-88.3762626647949f);
+
+_PS_CONST(cephes_LOG2EF, 1.44269504088896341);
+_PS_CONST(cephes_exp_C1, 0.693359375);
+_PS_CONST(cephes_exp_C2, -2.12194440e-4);
+
+_PS_CONST(cephes_exp_p0, 1.9875691500E-4);
+_PS_CONST(cephes_exp_p1, 1.3981999507E-3);
+_PS_CONST(cephes_exp_p2, 8.3334519073E-3);
+_PS_CONST(cephes_exp_p3, 4.1665795894E-2);
+_PS_CONST(cephes_exp_p4, 1.6666665459E-1);
+_PS_CONST(cephes_exp_p5, 5.0000001201E-1);
+
+v4sf exp_ps(v4sf x) {
+  v4sf tmp = _mm_setzero_ps(), fx;
+#ifdef USE_SSE2
+  v4si emm0;
+#else
+  v2si mm0, mm1;
+#endif
+  v4sf one = *(v4sf*)_ps_1;
+
+  x = _mm_min_ps(x, *(v4sf*)_ps_exp_hi);
+  x = _mm_max_ps(x, *(v4sf*)_ps_exp_lo);
+
+  /* express exp(x) as exp(g + n*log(2)) */
+  fx = _mm_mul_ps(x, *(v4sf*)_ps_cephes_LOG2EF);
+  fx = _mm_add_ps(fx, *(v4sf*)_ps_0p5);
+
+  /* how to perform a floorf with SSE: just below */
+#ifndef USE_SSE2
+  /* step 1 : cast to int */
+  tmp = _mm_movehl_ps(tmp, fx);
+  mm0 = _mm_cvttps_pi32(fx);
+  mm1 = _mm_cvttps_pi32(tmp);
+  /* step 2 : cast back to float */
+  tmp = _mm_cvtpi32x2_ps(mm0, mm1);
+#else
+  emm0 = _mm_cvttps_epi32(fx);
+  tmp  = _mm_cvtepi32_ps(emm0);
+#endif
+  /* if greater, subtract 1 */
+  v4sf mask = _mm_cmpgt_ps(tmp, fx);    
+  mask = _mm_and_ps(mask, one);
+  fx = _mm_sub_ps(tmp, mask);
+
+  tmp = _mm_mul_ps(fx, *(v4sf*)_ps_cephes_exp_C1);
+  v4sf z = _mm_mul_ps(fx, *(v4sf*)_ps_cephes_exp_C2);
+  x = _mm_sub_ps(x, tmp);
+  x = _mm_sub_ps(x, z);
+
+  z = _mm_mul_ps(x,x);
+  
+  v4sf y = *(v4sf*)_ps_cephes_exp_p0;
+  y = _mm_mul_ps(y, x);
+  y = _mm_add_ps(y, *(v4sf*)_ps_cephes_exp_p1);
+  y = _mm_mul_ps(y, x);
+  y = _mm_add_ps(y, *(v4sf*)_ps_cephes_exp_p2);
+  y = _mm_mul_ps(y, x);
+  y = _mm_add_ps(y, *(v4sf*)_ps_cephes_exp_p3);
+  y = _mm_mul_ps(y, x);
+  y = _mm_add_ps(y, *(v4sf*)_ps_cephes_exp_p4);
+  y = _mm_mul_ps(y, x);
+  y = _mm_add_ps(y, *(v4sf*)_ps_cephes_exp_p5);
+  y = _mm_mul_ps(y, z);
+  y = _mm_add_ps(y, x);
+  y = _mm_add_ps(y, one);
+
+  /* build 2^n */
+#ifndef USE_SSE2
+  z = _mm_movehl_ps(z, fx);
+  mm0 = _mm_cvttps_pi32(fx);
+  mm1 = _mm_cvttps_pi32(z);
+  mm0 = _mm_add_pi32(mm0, *(v2si*)_pi32_0x7f);
+  mm1 = _mm_add_pi32(mm1, *(v2si*)_pi32_0x7f);
+  mm0 = _mm_slli_pi32(mm0, 23); 
+  mm1 = _mm_slli_pi32(mm1, 23);
+  
+  v4sf pow2n; 
+  COPY_MM_TO_XMM(mm0, mm1, pow2n);
+  _mm_empty();
+#else
+  emm0 = _mm_cvttps_epi32(fx);
+  emm0 = _mm_add_epi32(emm0, *(v4si*)_pi32_0x7f);
+  emm0 = _mm_slli_epi32(emm0, 23);
+  v4sf pow2n = _mm_castsi128_ps(emm0);
+#endif
+  y = _mm_mul_ps(y, pow2n);
+  return y;
+}
+
+_PS_CONST(minus_cephes_DP1, -0.78515625);
+_PS_CONST(minus_cephes_DP2, -2.4187564849853515625e-4);
+_PS_CONST(minus_cephes_DP3, -3.77489497744594108e-8);
+_PS_CONST(sincof_p0, -1.9515295891E-4);
+_PS_CONST(sincof_p1,  8.3321608736E-3);
+_PS_CONST(sincof_p2, -1.6666654611E-1);
+_PS_CONST(coscof_p0,  2.443315711809948E-005);
+_PS_CONST(coscof_p1, -1.388731625493765E-003);
+_PS_CONST(coscof_p2,  4.166664568298827E-002);
+_PS_CONST(cephes_FOPI, 1.27323954473516); // 4 / M_PI
+
+
+/* evaluation of 4 sines at once, using only SSE1+MMX intrinsics so
+   it also runs on old Athlon XPs and the Pentium III of your
+   grandmother.
+
+   The code is an exact rewrite of the cephes sinf function.
+   Precision is excellent as long as x < 8192 (I did not bother to
+   take into account the special handling they have for greater values
+   -- it does not return garbage for arguments over 8192, though, but
+   the extra precision is missing).
+
+   Note that it is such that sinf((float)M_PI) = 8.74e-8, which is the
+   surprising but correct result.
+
+   Performance is also surprisingly good, 1.33 times faster than the
+   Mac OS vsinf SSE2 function, and 1.5 times faster than the
+   __vrs4_sinf of AMD's ACML (which is only available in 64 bits). Not
+   too bad for an SSE1 function (with no special tuning)!
+   However, the latter libraries probably handle NaN, Inf,
+   denormalized and other special arguments much better.
+
+   On my core 1 duo, the execution of this function takes approximately 95 cycles.
+
+   From what I have observed in experiments with the Intel AMath lib, switching to an
+   SSE2 version would improve performance by only 10%.
+
+   Since it is based on SSE intrinsics, it has to be compiled at -O2 to
+   deliver full speed.
+*/
+v4sf sin_ps(v4sf x) { // any x
+  v4sf xmm1, xmm2 = _mm_setzero_ps(), xmm3, sign_bit, y;
+
+#ifdef USE_SSE2
+  v4si emm0, emm2;
+#else
+  v2si mm0, mm1, mm2, mm3;
+#endif
+  sign_bit = x;
+  /* take the absolute value */
+  x = _mm_and_ps(x, *(v4sf*)_ps_inv_sign_mask);
+  /* extract the sign bit (upper one) */
+  sign_bit = _mm_and_ps(sign_bit, *(v4sf*)_ps_sign_mask);
+  
+  /* scale by 4/Pi */
+  y = _mm_mul_ps(x, *(v4sf*)_ps_cephes_FOPI);
+
+  //printf("plop:"); print4(y); 
+#ifdef USE_SSE2
+  /* store the integer part of y in emm2 */
+  emm2 = _mm_cvttps_epi32(y);
+  /* j=(j+1) & (~1) (see the cephes sources) */
+  emm2 = _mm_add_epi32(emm2, *(v4si*)_pi32_1);
+  emm2 = _mm_and_si128(emm2, *(v4si*)_pi32_inv1);
+  y = _mm_cvtepi32_ps(emm2);
+  /* get the swap sign flag */
+  emm0 = _mm_and_si128(emm2, *(v4si*)_pi32_4);
+  emm0 = _mm_slli_epi32(emm0, 29);
+  /* get the polynomial selection mask:
+     there is one polynomial for 0 <= x <= Pi/4
+     and another one for Pi/4 < x <= Pi/2
+
+     Both branches will be computed.
+  */
+  emm2 = _mm_and_si128(emm2, *(v4si*)_pi32_2);
+  emm2 = _mm_cmpeq_epi32(emm2, _mm_setzero_si128());
+  
+  v4sf swap_sign_bit = _mm_castsi128_ps(emm0);
+  v4sf poly_mask = _mm_castsi128_ps(emm2);
+  sign_bit = _mm_xor_ps(sign_bit, swap_sign_bit);
+#else
+  /* store the integer part of y in mm2:mm3 */
+  xmm2 = _mm_movehl_ps(xmm2, y);
+  mm2 = _mm_cvttps_pi32(y);
+  mm3 = _mm_cvttps_pi32(xmm2);
+  /* j=(j+1) & (~1) (see the cephes sources) */
+  mm2 = _mm_add_pi32(mm2, *(v2si*)_pi32_1);
+  mm3 = _mm_add_pi32(mm3, *(v2si*)_pi32_1);
+  mm2 = _mm_and_si64(mm2, *(v2si*)_pi32_inv1);
+  mm3 = _mm_and_si64(mm3, *(v2si*)_pi32_inv1);
+  y = _mm_cvtpi32x2_ps(mm2, mm3);
+  /* get the swap sign flag */
+  mm0 = _mm_and_si64(mm2, *(v2si*)_pi32_4);
+  mm1 = _mm_and_si64(mm3, *(v2si*)_pi32_4);
+  mm0 = _mm_slli_pi32(mm0, 29);
+  mm1 = _mm_slli_pi32(mm1, 29);
+  /* get the polynomial selection mask */
+  mm2 = _mm_and_si64(mm2, *(v2si*)_pi32_2);
+  mm3 = _mm_and_si64(mm3, *(v2si*)_pi32_2);
+  mm2 = _mm_cmpeq_pi32(mm2, _mm_setzero_si64());
+  mm3 = _mm_cmpeq_pi32(mm3, _mm_setzero_si64());
+  v4sf swap_sign_bit, poly_mask;
+  COPY_MM_TO_XMM(mm0, mm1, swap_sign_bit);
+  COPY_MM_TO_XMM(mm2, mm3, poly_mask);
+  sign_bit = _mm_xor_ps(sign_bit, swap_sign_bit);
+  _mm_empty(); /* good-bye mmx */
+#endif
+  
+  /* The magic pass: "Extended precision modular arithmetic" 
+     x = ((x - y * DP1) - y * DP2) - y * DP3; */
+  xmm1 = *(v4sf*)_ps_minus_cephes_DP1;
+  xmm2 = *(v4sf*)_ps_minus_cephes_DP2;
+  xmm3 = *(v4sf*)_ps_minus_cephes_DP3;
+  xmm1 = _mm_mul_ps(y, xmm1);
+  xmm2 = _mm_mul_ps(y, xmm2);
+  xmm3 = _mm_mul_ps(y, xmm3);
+  x = _mm_add_ps(x, xmm1);
+  x = _mm_add_ps(x, xmm2);
+  x = _mm_add_ps(x, xmm3);
+
+  /* Evaluate the first polynomial (0 <= x <= Pi/4) */
+  y = *(v4sf*)_ps_coscof_p0;
+  v4sf z = _mm_mul_ps(x,x);
+
+  y = _mm_mul_ps(y, z);
+  y = _mm_add_ps(y, *(v4sf*)_ps_coscof_p1);
+  y = _mm_mul_ps(y, z);
+  y = _mm_add_ps(y, *(v4sf*)_ps_coscof_p2);
+  y = _mm_mul_ps(y, z);
+  y = _mm_mul_ps(y, z);
+  v4sf tmp = _mm_mul_ps(z, *(v4sf*)_ps_0p5);
+  y = _mm_sub_ps(y, tmp);
+  y = _mm_add_ps(y, *(v4sf*)_ps_1);
+  
+  /* Evaluate the second polynomial (sine coefficients) on the reduced x */
+
+  v4sf y2 = *(v4sf*)_ps_sincof_p0;
+  y2 = _mm_mul_ps(y2, z);
+  y2 = _mm_add_ps(y2, *(v4sf*)_ps_sincof_p1);
+  y2 = _mm_mul_ps(y2, z);
+  y2 = _mm_add_ps(y2, *(v4sf*)_ps_sincof_p2);
+  y2 = _mm_mul_ps(y2, z);
+  y2 = _mm_mul_ps(y2, x);
+  y2 = _mm_add_ps(y2, x);
+
+  /* select the correct result from the two polynomials */
+  xmm3 = poly_mask;
+  y2 = _mm_and_ps(xmm3, y2);
+  y = _mm_andnot_ps(xmm3, y);
+  y = _mm_add_ps(y,y2);
+  /* update the sign */
+  y = _mm_xor_ps(y, sign_bit);
+
+  return y;
+}
+
+/* almost the same as sin_ps */
+v4sf cos_ps(v4sf x) { // any x
+  v4sf xmm1, xmm2 = _mm_setzero_ps(), xmm3, y;
+#ifdef USE_SSE2
+  v4si emm0, emm2;
+#else
+  v2si mm0, mm1, mm2, mm3;
+#endif
+  /* take the absolute value */
+  x = _mm_and_ps(x, *(v4sf*)_ps_inv_sign_mask);
+  
+  /* scale by 4/Pi */
+  y = _mm_mul_ps(x, *(v4sf*)_ps_cephes_FOPI);
+  
+#ifdef USE_SSE2
+  /* store the integer part of y in emm2 */
+  emm2 = _mm_cvttps_epi32(y);
+  /* j=(j+1) & (~1) (see the cephes sources) */
+  emm2 = _mm_add_epi32(emm2, *(v4si*)_pi32_1);
+  emm2 = _mm_and_si128(emm2, *(v4si*)_pi32_inv1);
+  y = _mm_cvtepi32_ps(emm2);
+
+  emm2 = _mm_sub_epi32(emm2, *(v4si*)_pi32_2);
+  
+  /* get the swap sign flag */
+  emm0 = _mm_andnot_si128(emm2, *(v4si*)_pi32_4);
+  emm0 = _mm_slli_epi32(emm0, 29);
+  /* get the polynomial selection mask */
+  emm2 = _mm_and_si128(emm2, *(v4si*)_pi32_2);
+  emm2 = _mm_cmpeq_epi32(emm2, _mm_setzero_si128());
+  
+  v4sf sign_bit = _mm_castsi128_ps(emm0);
+  v4sf poly_mask = _mm_castsi128_ps(emm2);
+#else
+  /* store the integer part of y in mm2:mm3 */
+  xmm2 = _mm_movehl_ps(xmm2, y);
+  mm2 = _mm_cvttps_pi32(y);
+  mm3 = _mm_cvttps_pi32(xmm2);
+
+  /* j=(j+1) & (~1) (see the cephes sources) */
+  mm2 = _mm_add_pi32(mm2, *(v2si*)_pi32_1);
+  mm3 = _mm_add_pi32(mm3, *(v2si*)_pi32_1);
+  mm2 = _mm_and_si64(mm2, *(v2si*)_pi32_inv1);
+  mm3 = _mm_and_si64(mm3, *(v2si*)_pi32_inv1);
+
+  y = _mm_cvtpi32x2_ps(mm2, mm3);
+
+
+  mm2 = _mm_sub_pi32(mm2, *(v2si*)_pi32_2);
+  mm3 = _mm_sub_pi32(mm3, *(v2si*)_pi32_2);
+
+  /* get the swap sign flag in mm0:mm1 and the
+     polynomial selection mask in mm2:mm3 */
+
+  mm0 = _mm_andnot_si64(mm2, *(v2si*)_pi32_4);
+  mm1 = _mm_andnot_si64(mm3, *(v2si*)_pi32_4);
+  mm0 = _mm_slli_pi32(mm0, 29);
+  mm1 = _mm_slli_pi32(mm1, 29);
+
+  mm2 = _mm_and_si64(mm2, *(v2si*)_pi32_2);
+  mm3 = _mm_and_si64(mm3, *(v2si*)_pi32_2);
+
+  mm2 = _mm_cmpeq_pi32(mm2, _mm_setzero_si64());
+  mm3 = _mm_cmpeq_pi32(mm3, _mm_setzero_si64());
+
+  v4sf sign_bit, poly_mask;
+  COPY_MM_TO_XMM(mm0, mm1, sign_bit);
+  COPY_MM_TO_XMM(mm2, mm3, poly_mask);
+  _mm_empty(); /* good-bye mmx */
+#endif
+  /* The magic pass: "Extended precision modular arithmetic" 
+     x = ((x - y * DP1) - y * DP2) - y * DP3; */
+  xmm1 = *(v4sf*)_ps_minus_cephes_DP1;
+  xmm2 = *(v4sf*)_ps_minus_cephes_DP2;
+  xmm3 = *(v4sf*)_ps_minus_cephes_DP3;
+  xmm1 = _mm_mul_ps(y, xmm1);
+  xmm2 = _mm_mul_ps(y, xmm2);
+  xmm3 = _mm_mul_ps(y, xmm3);
+  x = _mm_add_ps(x, xmm1);
+  x = _mm_add_ps(x, xmm2);
+  x = _mm_add_ps(x, xmm3);
+  
+  /* Evaluate the first polynomial (0 <= x <= Pi/4) */
+  y = *(v4sf*)_ps_coscof_p0;
+  v4sf z = _mm_mul_ps(x,x);
+
+  y = _mm_mul_ps(y, z);
+  y = _mm_add_ps(y, *(v4sf*)_ps_coscof_p1);
+  y = _mm_mul_ps(y, z);
+  y = _mm_add_ps(y, *(v4sf*)_ps_coscof_p2);
+  y = _mm_mul_ps(y, z);
+  y = _mm_mul_ps(y, z);
+  v4sf tmp = _mm_mul_ps(z, *(v4sf*)_ps_0p5);
+  y = _mm_sub_ps(y, tmp);
+  y = _mm_add_ps(y, *(v4sf*)_ps_1);
+  
+  /* Evaluate the second polynomial (sine coefficients) on the reduced x */
+
+  v4sf y2 = *(v4sf*)_ps_sincof_p0;
+  y2 = _mm_mul_ps(y2, z);
+  y2 = _mm_add_ps(y2, *(v4sf*)_ps_sincof_p1);
+  y2 = _mm_mul_ps(y2, z);
+  y2 = _mm_add_ps(y2, *(v4sf*)_ps_sincof_p2);
+  y2 = _mm_mul_ps(y2, z);
+  y2 = _mm_mul_ps(y2, x);
+  y2 = _mm_add_ps(y2, x);
+
+  /* select the correct result from the two polynomials */
+  xmm3 = poly_mask;
+  y2 = _mm_and_ps(xmm3, y2);
+  y = _mm_andnot_ps(xmm3, y);
+  y = _mm_add_ps(y,y2);
+  /* update the sign */
+  y = _mm_xor_ps(y, sign_bit);
+
+  return y;
+}
+
+/* since sin_ps and cos_ps are almost identical, sincos_ps could replace both of them.
+   It is almost as fast, and gives you a free cosine with your sine */
+void sincos_ps(v4sf x, v4sf *s, v4sf *c) {
+  v4sf xmm1, xmm2, xmm3 = _mm_setzero_ps(), sign_bit_sin, y;
+#ifdef USE_SSE2
+  v4si emm0, emm2, emm4;
+#else
+  v2si mm0, mm1, mm2, mm3, mm4, mm5;
+#endif
+  sign_bit_sin = x;
+  /* take the absolute value */
+  x = _mm_and_ps(x, *(v4sf*)_ps_inv_sign_mask);
+  /* extract the sign bit (upper one) */
+  sign_bit_sin = _mm_and_ps(sign_bit_sin, *(v4sf*)_ps_sign_mask);
+  
+  /* scale by 4/Pi */
+  y = _mm_mul_ps(x, *(v4sf*)_ps_cephes_FOPI);
+    
+#ifdef USE_SSE2
+  /* store the integer part of y in emm2 */
+  emm2 = _mm_cvttps_epi32(y);
+
+  /* j=(j+1) & (~1) (see the cephes sources) */
+  emm2 = _mm_add_epi32(emm2, *(v4si*)_pi32_1);
+  emm2 = _mm_and_si128(emm2, *(v4si*)_pi32_inv1);
+  y = _mm_cvtepi32_ps(emm2);
+
+  emm4 = emm2;
+
+  /* get the swap sign flag for the sine */
+  emm0 = _mm_and_si128(emm2, *(v4si*)_pi32_4);
+  emm0 = _mm_slli_epi32(emm0, 29);
+  v4sf swap_sign_bit_sin = _mm_castsi128_ps(emm0);
+
+  /* get the polynomial selection mask for the sine */
+  emm2 = _mm_and_si128(emm2, *(v4si*)_pi32_2);
+  emm2 = _mm_cmpeq_epi32(emm2, _mm_setzero_si128());
+  v4sf poly_mask = _mm_castsi128_ps(emm2);
+#else
+  /* store the integer part of y in mm2:mm3 */
+  xmm3 = _mm_movehl_ps(xmm3, y);
+  mm2 = _mm_cvttps_pi32(y);
+  mm3 = _mm_cvttps_pi32(xmm3);
+
+  /* j=(j+1) & (~1) (see the cephes sources) */
+  mm2 = _mm_add_pi32(mm2, *(v2si*)_pi32_1);
+  mm3 = _mm_add_pi32(mm3, *(v2si*)_pi32_1);
+  mm2 = _mm_and_si64(mm2, *(v2si*)_pi32_inv1);
+  mm3 = _mm_and_si64(mm3, *(v2si*)_pi32_inv1);
+
+  y = _mm_cvtpi32x2_ps(mm2, mm3);
+
+  mm4 = mm2;
+  mm5 = mm3;
+
+  /* get the swap sign flag for the sine */
+  mm0 = _mm_and_si64(mm2, *(v2si*)_pi32_4);
+  mm1 = _mm_and_si64(mm3, *(v2si*)_pi32_4);
+  mm0 = _mm_slli_pi32(mm0, 29);
+  mm1 = _mm_slli_pi32(mm1, 29);
+  v4sf swap_sign_bit_sin;
+  COPY_MM_TO_XMM(mm0, mm1, swap_sign_bit_sin);
+
+  /* get the polynomial selection mask for the sine */
+
+  mm2 = _mm_and_si64(mm2, *(v2si*)_pi32_2);
+  mm3 = _mm_and_si64(mm3, *(v2si*)_pi32_2);
+  mm2 = _mm_cmpeq_pi32(mm2, _mm_setzero_si64());
+  mm3 = _mm_cmpeq_pi32(mm3, _mm_setzero_si64());
+  v4sf poly_mask;
+  COPY_MM_TO_XMM(mm2, mm3, poly_mask);
+#endif
+
+  /* The magic pass: "Extended precision modular arithmetic" 
+     x = ((x - y * DP1) - y * DP2) - y * DP3; */
+  xmm1 = *(v4sf*)_ps_minus_cephes_DP1;
+  xmm2 = *(v4sf*)_ps_minus_cephes_DP2;
+  xmm3 = *(v4sf*)_ps_minus_cephes_DP3;
+  xmm1 = _mm_mul_ps(y, xmm1);
+  xmm2 = _mm_mul_ps(y, xmm2);
+  xmm3 = _mm_mul_ps(y, xmm3);
+  x = _mm_add_ps(x, xmm1);
+  x = _mm_add_ps(x, xmm2);
+  x = _mm_add_ps(x, xmm3);
+
+#ifdef USE_SSE2
+  emm4 = _mm_sub_epi32(emm4, *(v4si*)_pi32_2);
+  emm4 = _mm_andnot_si128(emm4, *(v4si*)_pi32_4);
+  emm4 = _mm_slli_epi32(emm4, 29);
+  v4sf sign_bit_cos = _mm_castsi128_ps(emm4);
+#else
+  /* get the sign flag for the cosine */
+  mm4 = _mm_sub_pi32(mm4, *(v2si*)_pi32_2);
+  mm5 = _mm_sub_pi32(mm5, *(v2si*)_pi32_2);
+  mm4 = _mm_andnot_si64(mm4, *(v2si*)_pi32_4);
+  mm5 = _mm_andnot_si64(mm5, *(v2si*)_pi32_4);
+  mm4 = _mm_slli_pi32(mm4, 29);
+  mm5 = _mm_slli_pi32(mm5, 29);
+  v4sf sign_bit_cos;
+  COPY_MM_TO_XMM(mm4, mm5, sign_bit_cos);
+  _mm_empty(); /* good-bye mmx */
+#endif
+
+  sign_bit_sin = _mm_xor_ps(sign_bit_sin, swap_sign_bit_sin);
+
+  
+  /* Evaluate the first polynomial (0 <= x <= Pi/4) */
+  v4sf z = _mm_mul_ps(x,x);
+  y = *(v4sf*)_ps_coscof_p0;
+
+  y = _mm_mul_ps(y, z);
+  y = _mm_add_ps(y, *(v4sf*)_ps_coscof_p1);
+  y = _mm_mul_ps(y, z);
+  y = _mm_add_ps(y, *(v4sf*)_ps_coscof_p2);
+  y = _mm_mul_ps(y, z);
+  y = _mm_mul_ps(y, z);
+  v4sf tmp = _mm_mul_ps(z, *(v4sf*)_ps_0p5);
+  y = _mm_sub_ps(y, tmp);
+  y = _mm_add_ps(y, *(v4sf*)_ps_1);
+  
+  /* Evaluate the second polynomial (sine coefficients) on the reduced x */
+
+  v4sf y2 = *(v4sf*)_ps_sincof_p0;
+  y2 = _mm_mul_ps(y2, z);
+  y2 = _mm_add_ps(y2, *(v4sf*)_ps_sincof_p1);
+  y2 = _mm_mul_ps(y2, z);
+  y2 = _mm_add_ps(y2, *(v4sf*)_ps_sincof_p2);
+  y2 = _mm_mul_ps(y2, z);
+  y2 = _mm_mul_ps(y2, x);
+  y2 = _mm_add_ps(y2, x);
+
+  /* select the correct result from the two polynomials */
+  xmm3 = poly_mask;
+  v4sf ysin2 = _mm_and_ps(xmm3, y2);
+  v4sf ysin1 = _mm_andnot_ps(xmm3, y);
+  y2 = _mm_sub_ps(y2,ysin2);
+  y = _mm_sub_ps(y, ysin1);
+
+  xmm1 = _mm_add_ps(ysin1,ysin2);
+  xmm2 = _mm_add_ps(y,y2);
+ 
+  /* update the sign */
+  *s = _mm_xor_ps(xmm1, sign_bit_sin);
+  *c = _mm_xor_ps(xmm2, sign_bit_cos);
+}
+
+#endif
+
diff --git a/src/rubberband-c.cpp b/src/rubberband-c.cpp
index d5bf2aa..3bb4fc5 100644
--- a/src/rubberband-c.cpp
+++ b/src/rubberband-c.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "rubberband/rubberband-c.h"
diff --git a/src/speex/COPYING b/src/speex/COPYING
new file mode 100644
index 0000000..9bda595
--- /dev/null
+++ b/src/speex/COPYING
@@ -0,0 +1,35 @@
+Copyright 2002-2007 	Xiph.org Foundation
+Copyright 2002-2007 	Jean-Marc Valin
+Copyright 2005-2007	Analog Devices Inc.
+Copyright 2005-2007	Commonwealth Scientific and Industrial Research 
+                        Organisation (CSIRO)
+Copyright 1993, 2002, 2006 David Rowe
+Copyright 2003 		EpicGames
+Copyright 1992-1994	Jutta Degener, Carsten Bormann
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+- Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+
+- Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+
+- Neither the name of the Xiph.org Foundation nor the names of its
+contributors may be used to endorse or promote products derived from
+this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR
+CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
diff --git a/src/speex/resample.c b/src/speex/resample.c
new file mode 100644
index 0000000..6ee8a53
--- /dev/null
+++ b/src/speex/resample.c
@@ -0,0 +1,1264 @@
+/* -*- c-basic-offset: 4 indent-tabs-mode: nil -*- vi:set ts=8 sts=4 sw=4: */
+
+/* Copyright (C) 2007 Jean-Marc Valin
+
+   File: resample.c
+   Arbitrary resampling code
+
+   Redistribution and use in source and binary forms, with or without
+   modification, are permitted provided that the following conditions are
+   met:
+
+   1. Redistributions of source code must retain the above copyright notice,
+   this list of conditions and the following disclaimer.
+
+   2. Redistributions in binary form must reproduce the above copyright
+   notice, this list of conditions and the following disclaimer in the
+   documentation and/or other materials provided with the distribution.
+
+   3. The name of the author may not be used to endorse or promote products
+   derived from this software without specific prior written permission.
+
+   THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+   IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+   OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+   DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
+   INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+   (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+   SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+   HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+   STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+   ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+   POSSIBILITY OF SUCH DAMAGE.
+*/
+
+/*
+   The design goals of this code are:
+      - Very fast algorithm
+      - SIMD-friendly algorithm
+      - Low memory requirement
+      - Good *perceptual* quality (and not best SNR)
+
+   Warning: This resampler is relatively new. Although I think I got rid of
+   all the major bugs and I don't expect the API to change anymore, there
+   may be something I've missed. So use with caution.
+
+   This algorithm is based on this original resampling algorithm:
+   Smith, Julius O. Digital Audio Resampling Home Page
+   Center for Computer Research in Music and Acoustics (CCRMA),
+   Stanford University, 2007.
+   Web published at http://www-ccrma.stanford.edu/~jos/resample/.
+
+   There is one main difference, though. This resampler uses cubic
+   interpolation instead of linear interpolation in the above paper. This
+   makes the table much smaller and makes it possible to compute that table
+   on a per-stream basis. In turn, being able to tweak the table for each
+   stream makes it possible to both reduce complexity on simple ratios
+   (e.g. 2/3), and get rid of the rounding operations in the inner loop.
+   The latter both reduces CPU time and makes the algorithm more SIMD-friendly.
+*/
+
+/*
+   NOTE: This code has been cut down and reformatted by Chris Cannam
+   for personal reading preference, and for use in the Rubber Band
+   time stretching and pitch shifting library.  If you have problems
+   with this code, cast suspicion on the butchering it has undergone;
+   it's probably my fault.  If you want a properly functioning
+   version, please go for the original Speex code first.  I haven't
+   made any substantial changes to this code, I've just made it less
+   generally useful.
+*/
+
+#include <stdlib.h>
+#include <stdio.h>
+
+#include <string.h>
+
+#ifdef HAVE_IPP
+#include <ipps.h>
+#endif
+
+// Simple allocators with a fixed minimum, to avoid reallocation if
+// the size changes but remains smaller than that.  The system alloc
+// functions no doubt do exactly the same thing for some value
+// probably not too distant from ours, but we want the certainty.
+
+#define ALLOC_MINIMUM 4096
+
+static void *speex_alloc (int count, int size)
+{
+#ifdef HAVE_IPP
+    void *rv;
+#endif
+
+//	fprintf(stderr, "speex_alloc(%d,%d)\n", count, size);
+	if (count * size < ALLOC_MINIMUM) {
+//		fprintf(stderr, "upgrading count from %d to %d\n", count, ALLOC_MINIMUM / size);
+        count = ALLOC_MINIMUM / size;
+    }
+
+#ifdef HAVE_IPP
+    if (size == sizeof(float) && size == 4) { // or sizeof(int32) or whatever, doesn't matter
+        rv = ippsMalloc_32f(count);
+    } else if (size == sizeof(double) && size == 8) {
+        rv = ippsMalloc_64f(count);
+    } else {
+        rv = ippsMalloc_8u(count * size);
+    }
+//	fprintf(stderr, "allocated at %p; now setting %d bytes to zero\n", rv, count*size);
+    memset(rv, 0, count * size); /* arguments are (ptr, value, length) */
+//	fprintf(stderr, "returning %p\n",rv);
+    return rv;
+#else
+    return calloc(count, size);
+#endif
+}
+
+static void speex_free (void *ptr) 
+{
+//	fprintf(stderr,"speex_free(%p)\n", ptr);
+#ifdef HAVE_IPP
+  	ippsFree(ptr);
+#else
+    free(ptr);
+#endif
+}
+
+static void *speex_realloc (void *ptr, int oldcount, int newcount, int size)
+{
+#ifdef HAVE_IPP
+	void *newptr;
+#endif
+
+//	fprintf(stderr,"speex_realloc(%p,%d,%d,%d)\n", ptr, oldcount, newcount, size);
+
+    if (newcount * size < ALLOC_MINIMUM) {
+//		fprintf(stderr,"returning %p\n",ptr);
+        return ptr;
+    }
+//    fprintf(stderr, "NOTE: speex_realloc: actual reallocation happening (newcount = %d, size = %d)\n", newcount, size);
+
+#ifdef HAVE_IPP
+    newptr = speex_alloc(newcount, size);
+    if (ptr && oldcount > 0) {
+        int copy = newcount;
+        if (oldcount < copy) copy = oldcount;
+        memcpy(newptr, ptr, copy * size);
+    }
+    speex_free(ptr);
+//	fprintf(stderr,"returning %p\n", ptr);
+    return newptr;
+#else
+    return realloc(ptr, newcount * size);
+#endif
+}
+
+#include "speex_resampler.h"
+
+#include <math.h>
+
+#ifndef M_PI
+#define M_PI 3.14159265358979323846
+#endif
+
+#define FILTER_SIZE 64
+#define OVERSAMPLE 8
+
+#define IMAX(a,b) ((a) > (b) ? (a) : (b))
+#define IMIN(a,b) ((a) < (b) ? (a) : (b))
+
+#ifndef NULL
+#define NULL 0
+#endif
+
+
+typedef int (*resampler_basic_func)(SpeexResamplerState *, spx_uint32_t , const float *, spx_uint32_t *, float *, spx_uint32_t *);
+
+struct SpeexResamplerState_ {
+    spx_uint32_t in_rate;
+    spx_uint32_t out_rate;
+    spx_uint32_t num_rate;
+    spx_uint32_t den_rate;
+
+    int    quality;
+    spx_uint32_t nb_channels;
+    spx_uint32_t filt_len;
+    spx_uint32_t mem_alloc_size;
+    int          int_advance;
+    int          frac_advance;
+    float  cutoff;
+    spx_uint32_t oversample;
+    int          initialised;
+    int          started;
+
+    /* These are per-channel */
+    spx_int32_t  *last_sample;
+    spx_uint32_t *samp_frac_num;
+    spx_uint32_t *magic_samples;
+
+    float *mem;
+    float *sinc_table;
+    spx_uint32_t sinc_table_length;
+    spx_uint32_t sinc_table_alloc;
+    resampler_basic_func resampler_ptr;
+
+    int    in_stride;
+    int    out_stride;
+} ;
+
+static double kaiser12_table[68] = {
+    0.99859849, 1.00000000, 0.99859849, 0.99440475, 0.98745105, 0.97779076,
+    0.96549770, 0.95066529, 0.93340547, 0.91384741, 0.89213598, 0.86843014,
+    0.84290116, 0.81573067, 0.78710866, 0.75723148, 0.72629970, 0.69451601,
+    0.66208321, 0.62920216, 0.59606986, 0.56287762, 0.52980938, 0.49704014,
+    0.46473455, 0.43304576, 0.40211431, 0.37206735, 0.34301800, 0.31506490,
+    0.28829195, 0.26276832, 0.23854851, 0.21567274, 0.19416736, 0.17404546,
+    0.15530766, 0.13794294, 0.12192957, 0.10723616, 0.09382272, 0.08164178,
+    0.07063950, 0.06075685, 0.05193064, 0.04409466, 0.03718069, 0.03111947,
+    0.02584161, 0.02127838, 0.01736250, 0.01402878, 0.01121463, 0.00886058,
+    0.00691064, 0.00531256, 0.00401805, 0.00298291, 0.00216702, 0.00153438,
+    0.00105297, 0.00069463, 0.00043489, 0.00025272, 0.00013031, 0.0000527734,
+    0.00001000, 0.00000000
+};
+
+static double kaiser10_table[36] = {
+    0.99537781, 1.00000000, 0.99537781, 0.98162644, 0.95908712, 0.92831446,
+    0.89005583, 0.84522401, 0.79486424, 0.74011713, 0.68217934, 0.62226347,
+    0.56155915, 0.50119680, 0.44221549, 0.38553619, 0.33194107, 0.28205962,
+    0.23636152, 0.19515633, 0.15859932, 0.12670280, 0.09935205, 0.07632451,
+    0.05731132, 0.04193980, 0.02979584, 0.02044510, 0.01345224, 0.00839739,
+    0.00488951, 0.00257636, 0.00115101, 0.00035515, 0.00000000, 0.00000000
+};
+
+static double kaiser8_table[36] = {
+    0.99635258, 1.00000000, 0.99635258, 0.98548012, 0.96759014, 0.94302200,
+    0.91223751, 0.87580811, 0.83439927, 0.78875245, 0.73966538, 0.68797126,
+    0.63451750, 0.58014482, 0.52566725, 0.47185369, 0.41941150, 0.36897272,
+    0.32108304, 0.27619388, 0.23465776, 0.19672670, 0.16255380, 0.13219758,
+    0.10562887, 0.08273982, 0.06335451, 0.04724088, 0.03412321, 0.02369490,
+    0.01563093, 0.00959968, 0.00527363, 0.00233883, 0.00050000, 0.00000000
+};
+
+static double kaiser6_table[36] = {
+    0.99733006, 1.00000000, 0.99733006, 0.98935595, 0.97618418, 0.95799003,
+    0.93501423, 0.90755855, 0.87598009, 0.84068475, 0.80211977, 0.76076565,
+    0.71712752, 0.67172623, 0.62508937, 0.57774224, 0.53019925, 0.48295561,
+    0.43647969, 0.39120616, 0.34752997, 0.30580127, 0.26632152, 0.22934058,
+    0.19505503, 0.16360756, 0.13508755, 0.10953262, 0.08693120, 0.06722600,
+    0.05031820, 0.03607231, 0.02432151, 0.01487334, 0.00752000, 0.00000000
+};
+
+struct FuncDef {
+    double *table;
+    int oversample;
+};
+
+static struct FuncDef _KAISER12 = {kaiser12_table, 64};
+#define KAISER12 (&_KAISER12)
+static struct FuncDef _KAISER10 = {kaiser10_table, 32};
+#define KAISER10 (&_KAISER10)
+static struct FuncDef _KAISER8 = {kaiser8_table, 32};
+#define KAISER8 (&_KAISER8)
+static struct FuncDef _KAISER6 = {kaiser6_table, 32};
+#define KAISER6 (&_KAISER6)
+
+struct QualityMapping {
+    int base_length;
+    int oversample;
+    float downsample_bandwidth;
+    float upsample_bandwidth;
+
+    struct FuncDef *window_func;
+};
+
+
+/* This table maps conversion quality to internal parameters. There are two
+   reasons that explain why the up-sampling bandwidth is larger than the
+   down-sampling bandwidth:
+   1) When up-sampling, we can assume that the spectrum is already attenuated
+      close to the Nyquist rate (from an A/D or a previous resampling filter)
+   2) Any aliasing that occurs very close to the Nyquist rate will be masked
+      by the sinusoids/noise just below the Nyquist rate (guaranteed only for
+      up-sampling).
+*/
+
+static const struct QualityMapping quality_map[11] = {
+   {  8,  4, 0.830f, 0.860f, KAISER6 }, /* Q0 */
+   { 16,  4, 0.850f, 0.880f, KAISER6 }, /* Q1 */
+   { 32,  4, 0.882f, 0.910f, KAISER6 }, /* Q2 */  /* 82.3% cutoff ( ~60 dB stop) 6  */
+   { 48,  8, 0.895f, 0.917f, KAISER8 }, /* Q3 */  /* 84.9% cutoff ( ~80 dB stop) 8  */
+   { 64,  8, 0.921f, 0.940f, KAISER8 }, /* Q4 */  /* 88.7% cutoff ( ~80 dB stop) 8  */
+   { 80, 16, 0.922f, 0.940f, KAISER10}, /* Q5 */  /* 89.1% cutoff (~100 dB stop) 10 */
+   { 96, 16, 0.940f, 0.945f, KAISER10}, /* Q6 */  /* 91.5% cutoff (~100 dB stop) 10 */
+   {128, 16, 0.950f, 0.950f, KAISER10}, /* Q7 */  /* 93.1% cutoff (~100 dB stop) 10 */
+   {160, 16, 0.960f, 0.960f, KAISER10}, /* Q8 */  /* 94.5% cutoff (~100 dB stop) 10 */
+   {192, 32, 0.968f, 0.968f, KAISER12}, /* Q9 */  /* 95.5% cutoff (~100 dB stop) 10 */
+   {256, 32, 0.975f, 0.975f, KAISER12}, /* Q10 */ /* 96.6% cutoff (~100 dB stop) 10 */
+};
+/*8,24,40,56,80,104,128,160,200,256,320*/
+
+static double compute_func(float x, struct FuncDef *func)
+{
+    float y, frac;
+    double interp[4];
+    int ind;
+
+    y = x * func->oversample;
+    ind = (int)floor(y);
+    frac = (y - ind);
+
+    /* CSE will handle the repeated powers */
+    interp[3] =  -0.1666666667 * frac + 0.1666666667 * (frac * frac * frac);
+    interp[2] = frac + 0.5 * (frac * frac) - 0.5 * (frac * frac * frac);
+    interp[0] = -0.3333333333 * frac + 0.5 * (frac * frac) - 0.1666666667 * (frac * frac * frac);
+
+    /* Just to make sure we don't have rounding problems */
+    interp[1] = 1.f - interp[3] - interp[2] - interp[0];
+
+    /*sum = frac*accum[1] + (1-frac)*accum[2];*/
+    return 
+	interp[0]*func->table[ind] + interp[1]*func->table[ind+1] +
+	interp[2]*func->table[ind+2] + interp[3]*func->table[ind+3];
+}
+
+/* The slow way of computing a sinc for the table. Should improve that some day */
+static float sinc(float cutoff, float x, int N, struct FuncDef *window_func)
+{
+    float xx = x * cutoff;
+
+    if (fabsf(x) < 1e-6)
+        return cutoff;
+    else if (fabsf(x) > .5*N)
+        return 0;
+
+    /*FIXME: Can it really be any slower than this? */
+    return cutoff*sin(M_PI*xx) / (M_PI*xx)
+	* compute_func(fabs(2.*x / N), window_func);
+}
+
+static void cubic_coef(float frac, float interp[4])
+{
+    /* Compute interpolation coefficients. I'm not sure whether this
+    corresponds to cubic interpolation but I know it's MMSE-optimal on
+    a sinc */
+
+    interp[0] =  -0.16667f * frac + 0.16667f * frac * frac * frac;
+    interp[1] = frac + 0.5f * frac * frac - 0.5f * frac * frac * frac;
+    interp[3] = -0.33333f * frac + 0.5f * frac * frac - 0.16667f * frac * frac * frac;
+
+    /* Just to make sure we don't have rounding problems */
+    interp[2] = 1. - interp[0] - interp[1] - interp[3];
+}
+
+static int resampler_basic_direct_single(SpeexResamplerState *st, unsigned int channel_index, const float *in, unsigned int *in_len, float *out, unsigned int *out_len)
+{
+    int N = st->filt_len;
+    int out_sample = 0;
+    float *mem;
+    int last_sample = st->last_sample[channel_index];
+    unsigned int samp_frac_num = st->samp_frac_num[channel_index];
+
+    mem = st->mem + channel_index * st->mem_alloc_size;
+
+    while (!(last_sample >= (int)*in_len || out_sample >= (int)*out_len)) {
+
+        int j;
+        float sum = 0;
+
+        /* We already have all the filter coefficients pre-computed in the table */
+        const float *ptr;
+
+        for (j = 0; last_sample - N + 1 + j < 0; j++) {
+            sum += ((float)(mem[last_sample+j]) *
+		    (float)(st->sinc_table[samp_frac_num*st->filt_len+j]));
+        }
+
+        /* Do the new part */
+        if (in != NULL) {
+
+            ptr = in + st->in_stride * (last_sample - N + 1 + j);
+
+            for (; j < N; j++) {
+                sum += ((float)(*ptr) *
+			(float)(st->sinc_table[samp_frac_num*st->filt_len+j]));
+                ptr += st->in_stride;
+            }
+        }
+
+        *out = (sum);
+
+        out += st->out_stride;
+        out_sample++;
+        last_sample += st->int_advance;
+        samp_frac_num += st->frac_advance;
+
+        if (samp_frac_num >= st->den_rate) {
+            samp_frac_num -= st->den_rate;
+            last_sample++;
+        }
+    }
+
+    st->last_sample[channel_index] = last_sample;
+
+    st->samp_frac_num[channel_index] = samp_frac_num;
+    return out_sample;
+}
+
+/* This is the same as the previous function, except with a double-precision accumulator */
+static int resampler_basic_direct_double(SpeexResamplerState *st, unsigned int channel_index, const float *in, unsigned int *in_len, float *out, unsigned int *out_len)
+{
+    int N = st->filt_len;
+    int out_sample = 0;
+    float *mem;
+    int last_sample = st->last_sample[channel_index];
+    unsigned int samp_frac_num = st->samp_frac_num[channel_index];
+
+    mem = st->mem + channel_index * st->mem_alloc_size;
+
+    while (!(last_sample >= (int)*in_len || out_sample >= (int)*out_len)) {
+
+        int j;
+        double sum = 0;
+
+        /* We already have all the filter coefficients pre-computed in
+         * the table */
+        const float *ptr;
+
+        for (j = 0; last_sample - N + 1 + j < 0; j++) {
+            sum += ((float)(mem[last_sample+j]) *
+		    (float)((double)st->sinc_table[samp_frac_num*st->filt_len+j]));
+        }
+
+        /* Do the new part */
+        if (in != NULL) {
+            ptr = in + st->in_stride * (last_sample - N + 1 + j);
+
+            for (; j < N; j++) {
+                sum += ((float)(*ptr) *
+			(float)((double)st->sinc_table[samp_frac_num*st->filt_len+j]));
+                ptr += st->in_stride;
+            }
+        }
+
+        *out = sum;
+
+        out += st->out_stride;
+        out_sample++;
+        last_sample += st->int_advance;
+        samp_frac_num += st->frac_advance;
+
+        if (samp_frac_num >= st->den_rate) {
+            samp_frac_num -= st->den_rate;
+            last_sample++;
+        }
+    }
+
+    st->last_sample[channel_index] = last_sample;
+
+    st->samp_frac_num[channel_index] = samp_frac_num;
+    return out_sample;
+}
+
+static int resampler_basic_interpolate_single(SpeexResamplerState *st, unsigned int channel_index, const float *in, unsigned int *in_len, float *out, unsigned int *out_len)
+{
+    int N = st->filt_len;
+    int out_sample = 0;
+    float *mem;
+    int last_sample = st->last_sample[channel_index];
+    unsigned int samp_frac_num = st->samp_frac_num[channel_index];
+
+    mem = st->mem + channel_index * st->mem_alloc_size;
+
+    while (!(last_sample >= (int)*in_len || out_sample >= (int)*out_len)) {
+
+        int j;
+        float sum = 0;
+
+        /* We need to interpolate the sinc filter */
+        float accum[4] = {0.f, 0.f, 0.f, 0.f};
+        float interp[4];
+        const float *ptr;
+        int offset;
+        float frac;
+
+        offset = samp_frac_num * st->oversample / st->den_rate;
+
+        frac = ((float)((samp_frac_num * st->oversample) % st->den_rate))
+	    / st->den_rate;
+
+        /* This code is written like this to make it easy to optimise
+	 * with SIMD.  For most DSPs, it would be best to split the
+	 * loops in two because most DSPs have only two
+	 * accumulators */
+
+        for (j = 0; last_sample - N + 1 + j < 0; j++) {
+
+            float curr_mem = mem[last_sample+j];
+
+            accum[0] += ((float)(curr_mem) *
+                         (float)(st->sinc_table
+                                 [4 + (j+1)*st->oversample - offset - 2]));
+            accum[1] += ((float)(curr_mem) *
+                         (float)(st->sinc_table
+                                 [4 + (j+1)*st->oversample - offset - 1]));
+            accum[2] += ((float)(curr_mem) *
+                         (float)(st->sinc_table
+                                 [4 + (j+1)*st->oversample - offset]));
+            accum[3] += ((float)(curr_mem) *
+                         (float)(st->sinc_table
+                                 [4 + (j+1)*st->oversample - offset + 1]));
+        }
+
+        if (in != NULL) {
+
+            ptr = in + st->in_stride * (last_sample - N + 1 + j);
+
+            /* Do the new part */
+            for (; j < N; j++) {
+
+                float curr_in = *ptr;
+                ptr += st->in_stride;
+
+                accum[0] += ((float)(curr_in) *
+                             (float)(st->sinc_table
+                                     [4 + (j+1)*st->oversample - offset - 2]));
+                accum[1] += ((float)(curr_in) *
+                             (float)(st->sinc_table
+                                     [4 + (j+1)*st->oversample - offset - 1]));
+                accum[2] += ((float)(curr_in) *
+                             (float)(st->sinc_table
+                                     [4 + (j+1)*st->oversample - offset]));
+                accum[3] += ((float)(curr_in) *
+                             (float)(st->sinc_table
+                                     [4 + (j+1)*st->oversample - offset + 1]));
+            }
+        }
+
+        cubic_coef(frac, interp);
+
+        sum =
+	    ((interp[0]) * (accum[0])) +
+	    ((interp[1]) * (accum[1])) +
+	    ((interp[2]) * (accum[2])) +
+	    ((interp[3]) * (accum[3]));
+
+        *out = (sum);
+        out += st->out_stride;
+        out_sample++;
+        last_sample += st->int_advance;
+        samp_frac_num += st->frac_advance;
+
+        if (samp_frac_num >= st->den_rate) {
+            samp_frac_num -= st->den_rate;
+            last_sample++;
+        }
+    }
+
+    st->last_sample[channel_index] = last_sample;
+    st->samp_frac_num[channel_index] = samp_frac_num;
+    return out_sample;
+}
+
+/* This is the same as the previous function, except with a
+ * double-precision accumulator */
+static int resampler_basic_interpolate_double(SpeexResamplerState *st, unsigned int channel_index, const float *in, unsigned int *in_len, float *out, unsigned int *out_len) 
+{
+    int N = st->filt_len;
+    int out_sample = 0;
+    float *mem;
+    int last_sample = st->last_sample[channel_index];
+    unsigned int samp_frac_num = st->samp_frac_num[channel_index];
+
+    mem = st->mem + channel_index * st->mem_alloc_size;
+
+    while (!(last_sample >= (int)*in_len || out_sample >= (int)*out_len)) {
+
+        int j;
+        float sum = 0;
+
+        /* We need to interpolate the sinc filter */
+        double accum[4] = {0.f, 0.f, 0.f, 0.f};
+        float interp[4];
+        const float *ptr;
+        float alpha = ((float)samp_frac_num) / st->den_rate;
+        int offset = samp_frac_num * st->oversample / st->den_rate;
+        float frac = alpha * st->oversample - offset;
+
+        /* This code is written like this to make it easy to optimise
+	 * with SIMD.  For most DSPs, it would be best to split the
+	 * loops in two because most DSPs have only two
+	 * accumulators */
+
+        for (j = 0; last_sample - N + 1 + j < 0; j++) {
+
+            double curr_mem = mem[last_sample + j];
+
+            accum[0] += ((float)(curr_mem) *
+                         (float)(st->sinc_table
+                                 [4 + (j+1)*st->oversample - offset - 2]));
+            accum[1] += ((float)(curr_mem) *
+                         (float)(st->sinc_table
+                                 [4 + (j+1)*st->oversample - offset - 1]));
+            accum[2] += ((float)(curr_mem) *
+                         (float)(st->sinc_table
+                                 [4 + (j+1)*st->oversample - offset]));
+            accum[3] += ((float)(curr_mem) *
+                         (float)(st->sinc_table
+                                 [4 + (j+1)*st->oversample - offset + 1]));
+        }
+
+        if (in != NULL) {
+
+            ptr = in + st->in_stride * (last_sample - N + 1 + j);
+
+            /* Do the new part */
+            for (; j < N; j++) {
+
+                double curr_in = *ptr;
+                ptr += st->in_stride;
+
+                accum[0] += ((float)(curr_in) *
+                             (float)(st->sinc_table
+                                     [4 + (j+1)*st->oversample - offset - 2]));
+                accum[1] += ((float)(curr_in) *
+                             (float)(st->sinc_table
+                                     [4 + (j+1)*st->oversample - offset - 1]));
+                accum[2] += ((float)(curr_in) *
+                             (float)(st->sinc_table
+                                     [4 + (j+1)*st->oversample - offset]));
+                accum[3] += ((float)(curr_in) *
+                             (float)(st->sinc_table
+                                     [4 + (j+1)*st->oversample - offset + 1]));
+            }
+        }
+
+        cubic_coef(frac, interp);
+
+        sum =
+	    interp[0] * accum[0] +
+	    interp[1] * accum[1] +
+	    interp[2] * accum[2] +
+	    interp[3] * accum[3];
+
+        *out = (sum);
+        out += st->out_stride;
+        out_sample++;
+        last_sample += st->int_advance;
+        samp_frac_num += st->frac_advance;
+
+        if (samp_frac_num >= st->den_rate) {
+            samp_frac_num -= st->den_rate;
+            last_sample++;
+        }
+    }
+
+    st->last_sample[channel_index] = last_sample;
+    st->samp_frac_num[channel_index] = samp_frac_num;
+
+    return out_sample;
+}
+
+static void update_filter(SpeexResamplerState *st)
+{
+    unsigned int old_length;
+
+    /*   fprintf(stderr, "update_filter\n"); */
+
+    old_length = st->filt_len;
+    st->oversample = quality_map[st->quality].oversample;
+    st->filt_len = quality_map[st->quality].base_length;
+
+    if (st->num_rate > st->den_rate) {
+
+        /* down-sampling */
+        st->cutoff = quality_map[st->quality].downsample_bandwidth
+            * st->den_rate / st->num_rate;
+
+        st->filt_len = (unsigned int)
+            ceil(st->filt_len * ((double)st->num_rate / (double)st->den_rate));
+
+        /* Round down to make sure we have a multiple of 4 */
+        st->filt_len &= (~0x3);
+
+        if (2*st->den_rate < st->num_rate)
+            st->oversample >>= 1;
+
+        if (4*st->den_rate < st->num_rate)
+            st->oversample >>= 1;
+
+        if (8*st->den_rate < st->num_rate)
+            st->oversample >>= 1;
+
+        if (16*st->den_rate < st->num_rate)
+            st->oversample >>= 1;
+
+        if (st->oversample < 1)
+            st->oversample = 1;
+
+    } else {
+
+        /* up-sampling */
+        st->cutoff = quality_map[st->quality].upsample_bandwidth;
+    }
+
+    /* Choose the resampling type that requires the least amount of memory */
+
+    if (st->den_rate <= st->oversample) {
+
+        unsigned int i;
+
+        if (!st->sinc_table) {
+
+            st->sinc_table = (float *)speex_alloc
+                (st->filt_len * st->den_rate, sizeof(float));
+
+        } else if (st->sinc_table_alloc < st->filt_len*st->den_rate) {
+
+            /* fprintf(stderr, "sinc_table=%p\n", st->sinc_table); */
+            st->sinc_table = (float *)speex_realloc
+                (st->sinc_table, st->sinc_table_alloc,
+                 st->filt_len * st->den_rate, sizeof(float));
+            st->sinc_table_alloc = st->filt_len * st->den_rate;
+        }
+
+        for (i = 0; i < st->den_rate; i++) {
+
+            int j;
+
+            for (j = 0; j < st->filt_len; j++) {
+                st->sinc_table[i*st->filt_len+j] = sinc
+                    (st->cutoff,
+                     ((j - (int)st->filt_len / 2 + 1) - ((float)i) / st->den_rate), 
+                     st->filt_len,
+                     quality_map[st->quality].window_func);
+            }
+        }
+
+        if (st->quality > 8) {
+            st->resampler_ptr = resampler_basic_direct_double;
+        } else {
+            st->resampler_ptr = resampler_basic_direct_single;
+        }
+
+        /*      fprintf (stderr, "resampler uses direct sinc table and normalised cutoff %f\n", st->cutoff); */
+
+    } else {
+
+        int i;
+
+        if (!st->sinc_table) {
+
+            st->sinc_table = (float *)speex_alloc
+                ((st->filt_len * st->oversample + 8), sizeof(float));
+
+        } else if (st->sinc_table_alloc < st->filt_len*st->oversample + 8) {
+
+            /* fprintf(stderr, "sinc_table=%p\n", st->sinc_table); */
+            st->sinc_table = (float *)speex_realloc
+                (st->sinc_table, st->sinc_table_alloc,
+                 (st->filt_len * st->oversample + 8), sizeof(float));
+            st->sinc_table_alloc = st->filt_len * st->oversample + 8;
+        }
+
+        for (i = -4; i < (int)(st->oversample * st->filt_len + 4); i++) {
+            st->sinc_table[i+4] = sinc
+                (st->cutoff,
+                 (i / (float)st->oversample - st->filt_len / 2),
+                 st->filt_len,
+                 quality_map[st->quality].window_func);
+        }
+
+        if (st->quality > 8)
+            st->resampler_ptr = resampler_basic_interpolate_double;
+        else
+            st->resampler_ptr = resampler_basic_interpolate_single;
+
+        /* fprintf (stderr, "resampler uses interpolated sinc table and normalised cutoff %f\n", st->cutoff); */
+
+        /* fprintf (stderr, "table length %d, filt len %d\n", st->sinc_table_length, st->filt_len); */
+    }
+
+    st->int_advance = st->num_rate / st->den_rate;
+    st->frac_advance = st->num_rate % st->den_rate;
+
+    /* Here's the place where we update the filter memory to take into
+       account the change in filter length. It's probably the messiest
+       part of the code due to handling of lots of corner cases. */
+
+    if (!st->mem) {
+
+        unsigned int i;
+        st->mem = (float*)speex_alloc
+            (st->nb_channels * (st->filt_len - 1), sizeof(float));
+
+        for (i = 0; i < st->nb_channels * (st->filt_len - 1); i++)
+            st->mem[i] = 0;
+
+        st->mem_alloc_size = st->filt_len - 1;
+
+    } else if (!st->started) {
+
+        unsigned int i;
+
+        /* fprintf(stderr, "mem=%p\n", st->mem); */
+        st->mem = (float*)speex_realloc
+            (st->mem, 0, st->nb_channels * (st->filt_len - 1), sizeof(float));
+
+        for (i = 0; i < st->nb_channels * (st->filt_len - 1); i++)
+            st->mem[i] = 0;
+
+        st->mem_alloc_size = st->filt_len - 1;
+
+    } else if (st->filt_len > old_length) {
+
+        int i;
+
+        /* Increase the filter length */
+
+        int old_alloc_size = st->mem_alloc_size;
+
+        if (st->filt_len - 1 > st->mem_alloc_size) {
+
+            /* fprintf(stderr, "mem=%p\n", st->mem); */
+
+            st->mem = (float*)speex_realloc
+                (st->mem, st->nb_channels * (old_length - 1),
+                 st->nb_channels * (st->filt_len - 1), sizeof(float));
+            st->mem_alloc_size = st->filt_len - 1;
+        }
+
+        for (i = st->nb_channels - 1; i >= 0; i--) {
+
+            int j;
+            unsigned int olen = old_length;
+
+            /*if (st->magic_samples[i])*/
+            {
+
+                /* Try and remove the magic samples as if nothing had happened */
+
+                /* FIXME: This is wrong but for now we need it to
+                 * avoid going over the array bounds */
+
+                olen = old_length + 2 * st->magic_samples[i];
+
+                for (j = old_length - 2 + st->magic_samples[i]; j >= 0; j--) {
+                    st->mem[i*st->mem_alloc_size+j+st->magic_samples[i]] =
+                        st->mem[i*old_alloc_size+j];
+                }
+
+                for (j = 0; j < st->magic_samples[i]; j++) {
+                    st->mem[i*st->mem_alloc_size+j] = 0;
+                }
+
+                st->magic_samples[i] = 0;
+            }
+
+            if (st->filt_len > olen) {
+
+                /* If the new filter length is still bigger than the
+                 * "augmented" length */
+
+                /* Copy data going backward */
+
+                for (j = 0; j < olen - 1; j++) {
+                    st->mem[i*st->mem_alloc_size+(st->filt_len-2-j)] =
+                        st->mem[i*st->mem_alloc_size+(olen-2-j)];
+                }
+
+                /* Then put zeros for lack of anything better */
+                for (; j < st->filt_len - 1; j++) {
+                    st->mem[i*st->mem_alloc_size+(st->filt_len-2-j)] = 0;
+                }
+
+                /* Adjust last_sample */
+                st->last_sample[i] += (st->filt_len - olen) / 2;
+
+            } else {
+
+                /* Put back some of the magic! */
+                st->magic_samples[i] = (olen - st->filt_len) / 2;
+
+                for (j = 0; j < st->filt_len - 1 + st->magic_samples[i]; j++) {
+                    st->mem[i*st->mem_alloc_size+j] =
+                        st->mem[i*st->mem_alloc_size+j+st->magic_samples[i]];
+                }
+            }
+        }
+    } else if (st->filt_len < old_length) {
+
+        unsigned int i;
+
+        /* Reduce the filter length; this is a bit tricky. We need to store
+           some of the memory as "magic" samples so they can be used
+           directly as input the next time(s) */
+
+        for (i = 0; i < st->nb_channels; i++) {
+
+            unsigned int j;
+            unsigned int old_magic = st->magic_samples[i];
+            st->magic_samples[i] = (old_length - st->filt_len) / 2;
+
+            /* We must copy some of the memory that's no longer used */
+            /* Copy data going backward */
+
+            for (j = 0; j < st->filt_len - 1 + st->magic_samples[i] + old_magic; j++) {
+                st->mem[i*st->mem_alloc_size+j] =
+                    st->mem[i*st->mem_alloc_size+j+st->magic_samples[i]];
+            }
+
+            st->magic_samples[i] += old_magic;
+        }
+    }
+}
+
+SpeexResamplerState *speex_resampler_init(unsigned int nb_channels, unsigned int in_rate, unsigned int out_rate, int quality, int *err)
+{
+    return speex_resampler_init_frac(nb_channels, in_rate, out_rate,
+				     in_rate, out_rate, quality, err);
+}
+
+SpeexResamplerState *speex_resampler_init_frac(unsigned int nb_channels, unsigned int ratio_num, unsigned int ratio_den, unsigned int in_rate, unsigned int out_rate, int quality, int *err)
+{
+    unsigned int i;
+    SpeexResamplerState *st;
+
+    if (quality > 10 || quality < 0) {
+        if (err) *err = RESAMPLER_ERR_INVALID_ARG;
+        return NULL;
+    }
+
+    st = (SpeexResamplerState *)speex_alloc(1, sizeof(SpeexResamplerState));
+
+    st->initialised = 0;
+    st->started = 0;
+    st->in_rate = 0;
+    st->out_rate = 0;
+    st->num_rate = 0;
+    st->den_rate = 0;
+    st->quality = -1;
+    st->sinc_table = 0;
+    st->sinc_table_length = 0;
+    st->sinc_table_alloc = 0;
+    st->mem_alloc_size = 0;
+    st->filt_len = 0;
+    st->mem = 0;
+    st->resampler_ptr = 0;
+
+    st->cutoff = 1.f;
+    st->nb_channels = nb_channels;
+    st->in_stride = 1;
+    st->out_stride = 1;
+
+    /* Per channel data */
+    st->last_sample = (int*)speex_alloc(nb_channels, sizeof(int));
+    st->magic_samples = (unsigned int*)speex_alloc(nb_channels, sizeof(int));
+    st->samp_frac_num = (unsigned int*)speex_alloc(nb_channels, sizeof(int));
+
+    for (i = 0; i < nb_channels; i++) {
+        st->last_sample[i] = 0;
+        st->magic_samples[i] = 0;
+        st->samp_frac_num[i] = 0;
+    }
+
+    speex_resampler_set_quality(st, quality);
+    speex_resampler_set_rate_frac(st, ratio_num, ratio_den, in_rate, out_rate);
+
+    update_filter(st);
+
+    st->initialised = 1;
+
+    if (err) *err = RESAMPLER_ERR_SUCCESS;
+
+    return st;
+}
+
+void speex_resampler_destroy(SpeexResamplerState *st)
+{
+    speex_free(st->mem);
+    speex_free(st->sinc_table);
+    speex_free(st->last_sample);
+    speex_free(st->magic_samples);
+    speex_free(st->samp_frac_num);
+    speex_free(st);
+}
+
+static int speex_resampler_process_native(SpeexResamplerState *st, unsigned int channel_index, const float *in, unsigned int *in_len, float *out, unsigned int *out_len)
+{
+    int j = 0;
+    int N = st->filt_len;
+    int out_sample = 0;
+    float *mem;
+    unsigned int tmp_out_len = 0;
+
+    mem = st->mem + channel_index * st->mem_alloc_size;
+    st->started = 1;
+
+    /* Handle the case where we have samples left from a reduction in
+     * filter length */
+
+    if (st->magic_samples[channel_index]) {
+
+        int istride_save;
+        unsigned int tmp_in_len;
+        unsigned int tmp_magic;
+
+        istride_save = st->in_stride;
+        tmp_in_len = st->magic_samples[channel_index];
+        tmp_out_len = *out_len;
+
+        /* magic_samples needs to be set to zero to avoid infinite recursion */
+        tmp_magic = st->magic_samples[channel_index];
+        st->magic_samples[channel_index] = 0;
+        st->in_stride = 1;
+        speex_resampler_process_native(st, channel_index, mem + N-1,
+                                       &tmp_in_len, out, &tmp_out_len);
+        st->in_stride = istride_save;
+
+        /* If we couldn't process all "magic" input samples, save the
+         * rest for next time */
+
+        if (tmp_in_len < tmp_magic) {
+
+            unsigned int i;
+
+            st->magic_samples[channel_index] = tmp_magic - tmp_in_len;
+
+            for (i = 0; i < st->magic_samples[channel_index]; i++) {
+                mem[N-1+i] = mem[N-1+i+tmp_in_len];
+            }
+        }
+
+        out += tmp_out_len * st->out_stride;
+        *out_len -= tmp_out_len;
+    }
+
+    /* Call the right resampler through the function ptr */
+    out_sample = st->resampler_ptr(st, channel_index,
+                                   in, in_len, out, out_len);
+
+    if (st->last_sample[channel_index] < (int)*in_len) {
+        *in_len = st->last_sample[channel_index];
+    }
+
+    *out_len = out_sample + tmp_out_len;
+
+    st->last_sample[channel_index] -= *in_len;
+
+    for (j = 0; j < N-1 - (int)*in_len; j++) {
+        mem[j] = mem[j+*in_len];
+    }
+
+    if (in != NULL) {
+        for ( ; j < N-1; j++) mem[j] = in[st->in_stride*(j+*in_len-N+1)];
+    } else {
+        for ( ; j < N-1; j++) mem[j] = 0;
+    }
+
+    return RESAMPLER_ERR_SUCCESS;
+}
+
+int speex_resampler_process_float(SpeexResamplerState *st, unsigned int channel_index, const float *in, unsigned int *in_len, float *out, unsigned int *out_len)
+{
+    return speex_resampler_process_native(st, channel_index, in, in_len, out, out_len);
+}
+
+int speex_resampler_process_interleaved_float(SpeexResamplerState *st, const float *in, unsigned int *in_len, float *out, unsigned int *out_len)
+{
+    unsigned int i;
+    int istride_save, ostride_save;
+    unsigned int bak_len = *out_len;
+
+    istride_save = st->in_stride;
+    ostride_save = st->out_stride;
+    st->in_stride = st->out_stride = st->nb_channels;
+
+    for (i = 0; i < st->nb_channels; i++) {
+
+        *out_len = bak_len;
+
+        if (in != NULL) {
+            speex_resampler_process_float(st, i, in + i, in_len, out + i, out_len);
+        } else {
+            speex_resampler_process_float(st, i, NULL, in_len, out + i, out_len);
+        }
+    }
+
+    st->in_stride = istride_save;
+    st->out_stride = ostride_save;
+
+    return RESAMPLER_ERR_SUCCESS;
+}
+
+int speex_resampler_set_rate(SpeexResamplerState *st, unsigned int in_rate, unsigned int out_rate)
+{
+    return speex_resampler_set_rate_frac(st, in_rate, out_rate, in_rate, out_rate);
+}
+
+void speex_resampler_get_rate(SpeexResamplerState *st, unsigned int *in_rate, unsigned int *out_rate)
+{
+    *in_rate = st->in_rate;
+    *out_rate = st->out_rate;
+}
+
+static unsigned int gcd(unsigned int a, unsigned int b)
+{
+    /* Euclid */
+
+    while (b) {
+        unsigned int tmp = b;
+        b = a % b;
+        a = tmp;
+    }
+
+    return a;
+}
+
+int speex_resampler_set_rate_frac(SpeexResamplerState *st, unsigned int ratio_num, unsigned int ratio_den, unsigned int in_rate, unsigned int out_rate)
+{
+    unsigned int old_den;
+    unsigned int i;
+    unsigned int g;
+
+    if (st->in_rate == in_rate && st->out_rate == out_rate &&
+        st->num_rate == ratio_num && st->den_rate == ratio_den) {
+        return RESAMPLER_ERR_SUCCESS;
+    }
+
+    old_den = st->den_rate;
+
+    st->in_rate = in_rate;
+    st->out_rate = out_rate;
+
+    st->num_rate = ratio_num;
+    st->den_rate = ratio_den;
+
+    g = gcd(st->num_rate, st->den_rate);
+
+    st->num_rate /= g;
+    st->den_rate /= g;
+
+    if (old_den > 0) {
+
+        for (i = 0; i < st->nb_channels; i++) {
+
+            st->samp_frac_num[i] = st->samp_frac_num[i] * st->den_rate / old_den;
+
+            if (st->samp_frac_num[i] >= st->den_rate) {
+                st->samp_frac_num[i] = st->den_rate - 1;
+            }
+        }
+    }
+
+    if (st->initialised) {
+        update_filter(st);
+    }
+
+    return RESAMPLER_ERR_SUCCESS;
+}
+
+void speex_resampler_get_ratio(SpeexResamplerState *st, unsigned int *ratio_num, unsigned int *ratio_den)
+{
+    *ratio_num = st->num_rate;
+    *ratio_den = st->den_rate;
+}
+
+int speex_resampler_set_quality(SpeexResamplerState *st, int quality)
+{
+    if (quality > 10 || quality < 0) {
+        return RESAMPLER_ERR_INVALID_ARG;
+    }
+
+    if (st->quality == quality) {
+        return RESAMPLER_ERR_SUCCESS;
+    }
+
+    st->quality = quality;
+
+    if (st->initialised) {
+        update_filter(st);
+    }
+
+    return RESAMPLER_ERR_SUCCESS;
+}
+
+void speex_resampler_get_quality(SpeexResamplerState *st, int *quality)
+{
+    *quality = st->quality;
+}
+
+void speex_resampler_set_input_stride(SpeexResamplerState *st, unsigned int stride)
+{
+    st->in_stride = stride;
+}
+
+void speex_resampler_get_input_stride(SpeexResamplerState *st, unsigned int *stride)
+{
+    *stride = st->in_stride;
+}
+
+void speex_resampler_set_output_stride(SpeexResamplerState *st, unsigned int stride)
+{
+    st->out_stride = stride;
+}
+
+void speex_resampler_get_output_stride(SpeexResamplerState *st, unsigned int *stride)
+{
+    *stride = st->out_stride;
+}
+
+int speex_resampler_get_input_latency(SpeexResamplerState *st)
+{
+    return st->filt_len / 2;
+}
+
+int speex_resampler_get_output_latency(SpeexResamplerState *st)
+{
+    return ((st->filt_len / 2) * st->den_rate + (st->num_rate >> 1)) / st->num_rate;
+}
+
+int speex_resampler_skip_zeros(SpeexResamplerState *st)
+{
+    unsigned int i;
+
+    for (i = 0; i < st->nb_channels; i++) {
+        st->last_sample[i] = st->filt_len / 2;
+    }
+
+    return RESAMPLER_ERR_SUCCESS;
+}
+
+int speex_resampler_reset_mem(SpeexResamplerState *st)
+{
+    unsigned int i;
+
+    for (i = 0; i < st->nb_channels*(st->filt_len - 1); i++) {
+        st->mem[i] = 0;
+    }
+
+    return RESAMPLER_ERR_SUCCESS;
+}
+
+const char *speex_resampler_strerror(int err)
+{
+    switch (err) {
+
+    case RESAMPLER_ERR_SUCCESS:
+        return "Success.";
+
+    case RESAMPLER_ERR_ALLOC_FAILED:
+        return "Memory allocation failed.";
+
+    case RESAMPLER_ERR_BAD_STATE:
+        return "Bad resampler state.";
+
+    case RESAMPLER_ERR_INVALID_ARG:
+        return "Invalid argument.";
+
+    case RESAMPLER_ERR_PTR_OVERLAP:
+        return "Input and output buffers overlap.";
+
+    default:
+        return "Unknown error. Bad error code or strange version mismatch.";
+    }
+}
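Note on the resample.c hunk above: every resampler_basic_* loop walks the input in steps of num_rate/den_rate, split into an integer part (int_advance) and a fractional part (frac_advance), after speex_resampler_set_rate_frac() has reduced the ratio by its gcd. The standalone sketch below isolates only that stepping logic, to show how many output samples a given input length produces. It is an illustration under stated assumptions, not code from this patch: the helper names gcd_u and count_outputs are hypothetical, the position always starts at 0, there is no filter memory, and the output buffer is assumed to be unlimited.

```c
#include <assert.h>

/* Euclid's algorithm, mirroring the static gcd() in the patch. */
static unsigned int gcd_u(unsigned int a, unsigned int b)
{
    while (b) {
        unsigned int tmp = b;
        b = a % b;
        a = tmp;
    }
    return a;
}

/* Hypothetical helper (not part of the patch): count the output samples
 * the stepping loop emits for in_len input samples, where num/den is the
 * input rate divided by the output rate. */
static unsigned int count_outputs(unsigned int num, unsigned int den,
                                  unsigned int in_len)
{
    unsigned int g, int_advance, frac_advance;
    unsigned int last_sample = 0;    /* integer input position */
    unsigned int samp_frac_num = 0;  /* fractional position, in units of 1/den */
    unsigned int out_sample = 0;

    assert(num != 0 && den != 0);

    /* Reduce the ratio first, as speex_resampler_set_rate_frac() does. */
    g = gcd_u(num, den);
    num /= g;
    den /= g;

    int_advance = num / den;
    frac_advance = num % den;

    while (last_sample < in_len) {
        out_sample++;
        last_sample += int_advance;
        samp_frac_num += frac_advance;

        /* Carry a whole input sample out of the fractional part. */
        if (samp_frac_num >= den) {
            samp_frac_num -= den;
            last_sample++;
        }
    }

    return out_sample;
}
```

With these assumptions, a 48000 Hz to 44100 Hz conversion reduces to the ratio 160/147, so count_outputs(48000, 44100, 160) consumes 160 input samples and yields 147 outputs. The real functions additionally stop when the output buffer is full and carry last_sample and samp_frac_num across calls, which this sketch leaves out.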
diff --git a/src/speex/speex_resampler.h b/src/speex/speex_resampler.h
new file mode 100644
index 0000000..2e99955
--- /dev/null
+++ b/src/speex/speex_resampler.h
@@ -0,0 +1,301 @@
+/* Copyright (C) 2007 Jean-Marc Valin
+      
+   File: speex_resampler.h
+   Resampling code
+      
+   The design goals of this code are:
+      - Very fast algorithm
+      - Low memory requirement
+      - Good *perceptual* quality (and not best SNR)
+
+   Redistribution and use in source and binary forms, with or without
+   modification, are permitted provided that the following conditions are
+   met:
+
+   1. Redistributions of source code must retain the above copyright notice,
+   this list of conditions and the following disclaimer.
+
+   2. Redistributions in binary form must reproduce the above copyright
+   notice, this list of conditions and the following disclaimer in the
+   documentation and/or other materials provided with the distribution.
+
+   3. The name of the author may not be used to endorse or promote products
+   derived from this software without specific prior written permission.
+
+   THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+   IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+   OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+   DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
+   INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+   (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+   SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+   HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+   STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+   ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+   POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef SPEEX_RESAMPLER_H
+#define SPEEX_RESAMPLER_H
+
+/********* WARNING: MENTAL SANITY ENDS HERE *************/
+
+/* If the resampler is defined outside of Speex, we change the symbol
+   names so that there won't be any clash if linking with Speex later
+   on. */
+
+#define RANDOM_PREFIX rubberband
+
+#ifndef RANDOM_PREFIX
+#error "Please define RANDOM_PREFIX (above) to something specific to your project to prevent symbol name clashes"
+#endif
+
+#define CAT_PREFIX2(a,b) a ## b
+#define CAT_PREFIX(a,b) CAT_PREFIX2(a, b)
+      
+#define speex_resampler_init CAT_PREFIX(RANDOM_PREFIX,_resampler_init)
+#define speex_resampler_init_frac CAT_PREFIX(RANDOM_PREFIX,_resampler_init_frac)
+#define speex_resampler_destroy CAT_PREFIX(RANDOM_PREFIX,_resampler_destroy)
+#define speex_resampler_process_float CAT_PREFIX(RANDOM_PREFIX,_resampler_process_float)
+#define speex_resampler_process_int CAT_PREFIX(RANDOM_PREFIX,_resampler_process_int)
+#define speex_resampler_process_interleaved_float CAT_PREFIX(RANDOM_PREFIX,_resampler_process_interleaved_float)
+#define speex_resampler_process_interleaved_int CAT_PREFIX(RANDOM_PREFIX,_resampler_process_interleaved_int)
+#define speex_resampler_set_rate CAT_PREFIX(RANDOM_PREFIX,_resampler_set_rate)
+#define speex_resampler_get_rate CAT_PREFIX(RANDOM_PREFIX,_resampler_get_rate)
+#define speex_resampler_set_rate_frac CAT_PREFIX(RANDOM_PREFIX,_resampler_set_rate_frac)
+#define speex_resampler_get_ratio CAT_PREFIX(RANDOM_PREFIX,_resampler_get_ratio)
+#define speex_resampler_set_quality CAT_PREFIX(RANDOM_PREFIX,_resampler_set_quality)
+#define speex_resampler_get_quality CAT_PREFIX(RANDOM_PREFIX,_resampler_get_quality)
+#define speex_resampler_set_input_stride CAT_PREFIX(RANDOM_PREFIX,_resampler_set_input_stride)
+#define speex_resampler_get_input_stride CAT_PREFIX(RANDOM_PREFIX,_resampler_get_input_stride)
+#define speex_resampler_set_output_stride CAT_PREFIX(RANDOM_PREFIX,_resampler_set_output_stride)
+#define speex_resampler_get_output_stride CAT_PREFIX(RANDOM_PREFIX,_resampler_get_output_stride)
+#define speex_resampler_get_input_latency CAT_PREFIX(RANDOM_PREFIX,_resampler_get_input_latency)
+#define speex_resampler_get_output_latency CAT_PREFIX(RANDOM_PREFIX,_resampler_get_output_latency)
+#define speex_resampler_skip_zeros CAT_PREFIX(RANDOM_PREFIX,_resampler_skip_zeros)
+#define speex_resampler_reset_mem CAT_PREFIX(RANDOM_PREFIX,_resampler_reset_mem)
+#define speex_resampler_strerror CAT_PREFIX(RANDOM_PREFIX,_resampler_strerror)
+
+#define spx_int16_t short
+#define spx_int32_t int
+#define spx_uint16_t unsigned short
+#define spx_uint32_t unsigned int
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define SPEEX_RESAMPLER_QUALITY_MAX 10
+#define SPEEX_RESAMPLER_QUALITY_MIN 0
+#define SPEEX_RESAMPLER_QUALITY_DEFAULT 4
+#define SPEEX_RESAMPLER_QUALITY_VOIP 3
+#define SPEEX_RESAMPLER_QUALITY_DESKTOP 5
+
+enum {
+   RESAMPLER_ERR_SUCCESS         = 0,
+   RESAMPLER_ERR_ALLOC_FAILED    = 1,
+   RESAMPLER_ERR_BAD_STATE       = 2,
+   RESAMPLER_ERR_INVALID_ARG     = 3,
+   RESAMPLER_ERR_PTR_OVERLAP     = 4,
+   
+   RESAMPLER_ERR_MAX_ERROR
+};
+
+struct SpeexResamplerState_;
+typedef struct SpeexResamplerState_ SpeexResamplerState;
+
+/** Create a new resampler with integer input and output rates.
+ * @param nb_channels Number of channels to be processed
+ * @param in_rate Input sampling rate (integer number of Hz).
+ * @param out_rate Output sampling rate (integer number of Hz).
+ * @param quality Resampling quality between 0 (poor) and 10 (very high).
+ * @param err Receives a RESAMPLER_ERR_* code describing the outcome
+ * @return Newly created resampler state
+ * @retval NULL Error: not enough memory
+ */
+SpeexResamplerState *speex_resampler_init(spx_uint32_t nb_channels, 
+                                          spx_uint32_t in_rate, 
+                                          spx_uint32_t out_rate, 
+                                          int quality,
+                                          int *err);
+
+/** Create a new resampler with fractional input/output rates. The sampling 
+ * rate ratio is an arbitrary rational number with both the numerator and 
+ * denominator being 32-bit integers.
+ * @param nb_channels Number of channels to be processed
+ * @param ratio_num Numerator of the sampling rate ratio
+ * @param ratio_den Denominator of the sampling rate ratio
+ * @param in_rate Input sampling rate rounded to the nearest integer (in Hz).
+ * @param out_rate Output sampling rate rounded to the nearest integer (in Hz).
+ * @param quality Resampling quality between 0 (poor) and 10 (very high).
+ * @param err Receives a RESAMPLER_ERR_* code describing the outcome
+ * @return Newly created resampler state
+ * @retval NULL Error: not enough memory
+ */
+SpeexResamplerState *speex_resampler_init_frac(spx_uint32_t nb_channels, 
+                                               spx_uint32_t ratio_num, 
+                                               spx_uint32_t ratio_den, 
+                                               spx_uint32_t in_rate, 
+                                               spx_uint32_t out_rate, 
+                                               int quality,
+                                               int *err);
+
+/** Destroy a resampler state.
+ * @param st Resampler state
+ */
+void speex_resampler_destroy(SpeexResamplerState *st);
+
+/** Resample a float array. The input and output buffers must *not* overlap.
+ * @param st Resampler state
+ * @param channel_index Index of the channel to process for the multi-channel 
+ * case (0 otherwise)
+ * @param in Input buffer
+ * @param in_len Number of input samples in the input buffer. Returns the 
+ * number of samples processed
+ * @param out Output buffer
+ * @param out_len Size of the output buffer. Returns the number of samples written
+ */
+int speex_resampler_process_float(SpeexResamplerState *st, 
+                                   spx_uint32_t channel_index, 
+                                   const float *in, 
+                                   spx_uint32_t *in_len, 
+                                   float *out, 
+                                   spx_uint32_t *out_len);
+
+/** Resample an interleaved float array. The input and output buffers must *not* overlap.
+ * @param st Resampler state
+ * @param in Input buffer
+ * @param in_len Number of input samples in the input buffer. Returns the number
+ * of samples processed. This is all per-channel.
+ * @param out Output buffer
+ * @param out_len Size of the output buffer. Returns the number of samples written.
+ * This is all per-channel.
+ */
+int speex_resampler_process_interleaved_float(SpeexResamplerState *st, 
+                                               const float *in, 
+                                               spx_uint32_t *in_len, 
+                                               float *out, 
+                                               spx_uint32_t *out_len);
+
+/** Set (change) the input/output sampling rates (integer value).
+ * @param st Resampler state
+ * @param in_rate Input sampling rate (integer number of Hz).
+ * @param out_rate Output sampling rate (integer number of Hz).
+ */
+int speex_resampler_set_rate(SpeexResamplerState *st, 
+                              spx_uint32_t in_rate, 
+                              spx_uint32_t out_rate);
+
+/** Get the current input/output sampling rates (integer value).
+ * @param st Resampler state
+ * @param in_rate Input sampling rate (integer number of Hz) copied.
+ * @param out_rate Output sampling rate (integer number of Hz) copied.
+ */
+void speex_resampler_get_rate(SpeexResamplerState *st, 
+                              spx_uint32_t *in_rate, 
+                              spx_uint32_t *out_rate);
+
+/** Set (change) the input/output sampling rates and resampling ratio 
+ * (fractional values in Hz supported).
+ * @param st Resampler state
+ * @param ratio_num Numerator of the sampling rate ratio
+ * @param ratio_den Denominator of the sampling rate ratio
+ * @param in_rate Input sampling rate rounded to the nearest integer (in Hz).
+ * @param out_rate Output sampling rate rounded to the nearest integer (in Hz).
+ */
+int speex_resampler_set_rate_frac(SpeexResamplerState *st, 
+                                   spx_uint32_t ratio_num, 
+                                   spx_uint32_t ratio_den, 
+                                   spx_uint32_t in_rate, 
+                                   spx_uint32_t out_rate);
+
+/** Get the current resampling ratio. The ratio is returned reduced to
+ * lowest terms.
+ * @param st Resampler state
+ * @param ratio_num Numerator of the sampling rate ratio copied
+ * @param ratio_den Denominator of the sampling rate ratio copied
+ */
+void speex_resampler_get_ratio(SpeexResamplerState *st, 
+                               spx_uint32_t *ratio_num, 
+                               spx_uint32_t *ratio_den);
+
+/** Set (change) the conversion quality.
+ * @param st Resampler state
+ * @param quality Resampling quality between 0 and 10, where 0 has poor 
+ * quality and 10 has very high quality.
+ */
+int speex_resampler_set_quality(SpeexResamplerState *st, 
+                                 int quality);
+
+/** Get the conversion quality.
+ * @param st Resampler state
+ * @param quality Resampling quality between 0 and 10, where 0 has poor 
+ * quality and 10 has very high quality.
+ */
+void speex_resampler_get_quality(SpeexResamplerState *st, 
+                                 int *quality);
+
+/** Set (change) the input stride.
+ * @param st Resampler state
+ * @param stride Input stride
+ */
+void speex_resampler_set_input_stride(SpeexResamplerState *st, 
+                                      spx_uint32_t stride);
+
+/** Get the input stride.
+ * @param st Resampler state
+ * @param stride Input stride copied
+ */
+void speex_resampler_get_input_stride(SpeexResamplerState *st, 
+                                      spx_uint32_t *stride);
+
+/** Set (change) the output stride.
+ * @param st Resampler state
+ * @param stride Output stride
+ */
+void speex_resampler_set_output_stride(SpeexResamplerState *st, 
+                                      spx_uint32_t stride);
+
+/** Get the output stride.
+ * @param st Resampler state
+ * @param stride Output stride copied
+ */
+void speex_resampler_get_output_stride(SpeexResamplerState *st, 
+                                      spx_uint32_t *stride);
+
+/** Get the latency in input samples introduced by the resampler.
+ * @param st Resampler state
+ */
+int speex_resampler_get_input_latency(SpeexResamplerState *st);
+
+/** Get the latency in output samples introduced by the resampler.
+ * @param st Resampler state
+ */
+int speex_resampler_get_output_latency(SpeexResamplerState *st);
+
+/** Make sure that the first samples to come out of the resampler don't have
+ * leading zeros. This is only useful before starting to use a newly created
+ * resampler. It is recommended when resampling an audio file, as the output
+ * file will then have the same length as the input. For real-time processing,
+ * it is probably easier not to use this call (so that the output duration
+ * is the same for the first frame).
+ * @param st Resampler state
+ */
+int speex_resampler_skip_zeros(SpeexResamplerState *st);
+
+/** Reset a resampler so a new (unrelated) stream can be processed.
+ * @param st Resampler state
+ */
+int speex_resampler_reset_mem(SpeexResamplerState *st);
+
+/** Returns the English meaning for an error code
+ * @param err Error code
+ * @return English string
+ */
+const char *speex_resampler_strerror(int err);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
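The `#define speex_resampler_... CAT_PREFIX(RANDOM_PREFIX,...)` lines in the header above rename every public symbol at preprocessing time, so a bundled copy of the resampler does not clash with another copy linked into the same program. The sketch below shows the mechanism in isolation. The two-level `CAT_PREFIX` definition and the `demo` prefix are assumptions made for illustration (the real definitions sit earlier in the header, outside this excerpt), and the function body is a stand-in, not the library's.

```cpp
#include <cassert>
#include <cstring>

// Assumed two-level form: the extra indirection lets RANDOM_PREFIX expand
// before the ## operator pastes the tokens together.
#define CAT_PREFIX2(a, b) a ## b
#define CAT_PREFIX(a, b) CAT_PREFIX2(a, b)

// Hypothetical prefix chosen for this sketch only.
#define RANDOM_PREFIX demo

// Same shape as the renaming lines in the header.
#define speex_resampler_strerror CAT_PREFIX(RANDOM_PREFIX,_resampler_strerror)

// After preprocessing, this defines demo_resampler_strerror().
const char *speex_resampler_strerror(int err)
{
    return err == 0 ? "Success." : "Unknown error.";
}

// Drop the renaming so the prefixed name has to be spelled directly.
#undef speex_resampler_strerror

void prefix_demo()
{
    // The function exists only under its prefixed name.
    assert(std::strcmp(demo_resampler_strerror(0), "Success.") == 0);
}
```

Callers that include the header keep writing `speex_resampler_strerror(...)`; only the name seen by the linker changes.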
diff --git a/src/system/Allocators.cpp b/src/system/Allocators.cpp
index 28563d6..4a8ee5b 100644
--- a/src/system/Allocators.cpp
+++ b/src/system/Allocators.cpp
@@ -1,19 +1,31 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "Allocators.h"
 
+#ifdef HAVE_IPP
+#include <ipps.h>
+#endif
 
 #include <iostream>
 using std::cerr;
@@ -21,6 +33,37 @@ using std::endl;
 
 namespace RubberBand {
 
+#ifdef HAVE_IPP
+
+template <>
+float *allocate(size_t count)
+{
+    float *ptr = ippsMalloc_32f(count);
+    if (!ptr) throw (std::bad_alloc());
+    return ptr;
+}
+
+template <>
+double *allocate(size_t count)
+{
+    double *ptr = ippsMalloc_64f(count);
+    if (!ptr) throw (std::bad_alloc());
+    return ptr;
+}
+
+template <>
+void deallocate(float *ptr)
+{
+    if (ptr) ippsFree((void *)ptr);
+}
+
+template <>
+void deallocate(double *ptr)
+{
+    if (ptr) ippsFree((void *)ptr);
+}
+
+#endif
 
 }
 
diff --git a/src/system/Allocators.h b/src/system/Allocators.h
index 659ef2e..d1eef7a 100644
--- a/src/system/Allocators.h
+++ b/src/system/Allocators.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_ALLOCATORS_H_
@@ -34,6 +43,9 @@
 #include <sys/mman.h>
 #endif
 
+#ifdef LACK_BAD_ALLOC
+namespace std { struct bad_alloc { }; }
+#endif
 
 namespace RubberBand {
 
@@ -41,22 +53,55 @@ template <typename T>
 T *allocate(size_t count)
 {
     void *ptr = 0;
+    // 32-byte alignment is required for at least OpenMAX
+    static const int alignment = 32;
+#ifdef USE_OWN_ALIGNED_MALLOC
+    // Alignment must be a power of two, bigger than the pointer
+    // size. Stuff the actual malloc'd pointer in just before the
+    // returned value.  This is the least desirable way to do this --
+    // the other options below are all better
+    size_t allocd = count * sizeof(T) + alignment;
+    void *buf = malloc(allocd);
+    if (buf) {
+        char *adj = (char *)buf;
+        while ((unsigned long long)adj & (alignment-1)) --adj;
+        ptr = ((char *)adj) + alignment;
+        ((void **)ptr)[-1] = buf;
+    }
+#else /* !USE_OWN_ALIGNED_MALLOC */
 #ifdef HAVE_POSIX_MEMALIGN
-    if (posix_memalign(&ptr, 16, count * sizeof(T))) {
+    if (posix_memalign(&ptr, alignment, count * sizeof(T))) {
         ptr = malloc(count * sizeof(T));
     }
-#else 
-    // Note that malloc always aligns to 16 byte boundaries on OS/X,
-    // so we don't need posix_memalign there (which is fortunate,
-    // since it doesn't exist)
+#else /* !HAVE_POSIX_MEMALIGN */
+#ifdef __MSVC__
+    ptr = _aligned_malloc(count * sizeof(T), alignment);
+#else /* !__MSVC__ */
+#warning "No aligned malloc available or defined"
+    // Note that malloc always aligns to 16 byte boundaries on OS/X
     ptr = malloc(count * sizeof(T));
-#endif 
+#endif /* !__MSVC__ */
+#endif /* !HAVE_POSIX_MEMALIGN */
+#endif /* !USE_OWN_ALIGNED_MALLOC */
     if (!ptr) {
+#ifndef NO_EXCEPTIONS
         throw(std::bad_alloc());
+#else
+        abort();
+#endif
     }
     return (T *)ptr;
 }
 
+#ifdef HAVE_IPP
+
+template <>
+float *allocate(size_t count);
+
+template <>
+double *allocate(size_t count);
+
+#endif
 	
 template <typename T>
 T *allocate_and_zero(size_t count)
@@ -69,9 +114,26 @@ T *allocate_and_zero(size_t count)
 template <typename T>
 void deallocate(T *ptr)
 {
+#ifdef USE_OWN_ALIGNED_MALLOC
+    if (ptr) free(((void **)ptr)[-1]);
+#else /* !USE_OWN_ALIGNED_MALLOC */
+#ifdef __MSVC__
+    if (ptr) _aligned_free((void *)ptr);
+#else /* !__MSVC__ */
     if (ptr) free((void *)ptr);
+#endif /* !__MSVC__ */
+#endif /* !USE_OWN_ALIGNED_MALLOC */
 }
 
+#ifdef HAVE_IPP
+
+template <>
+void deallocate(float *);
+
+template <>
+void deallocate(double *);
+
+#endif
 
 /// Reallocate preserving contents but leaving additional memory uninitialised	
 template <typename T>
@@ -159,6 +221,17 @@ T **reallocate_and_zero_extend_channels(T **ptr,
     return newptr;
 }
 
+/// RAII class to call deallocate() on destruction
+template <typename T>
+class Deallocator
+{
+public:
+    Deallocator(T *t) : m_t(t) { }
+    ~Deallocator() { deallocate<T>(m_t); }
+private:
+    T *m_t;
+};
+
 }
 
 #endif
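The `USE_OWN_ALIGNED_MALLOC` branch in the header above over-allocates, picks an aligned address inside the block, and stashes the original `malloc` pointer just before the address it returns so that `deallocate` can recover it. Below is a minimal standalone sketch of that idea. It is a common round-up variant rather than the exact arithmetic in the patch, and the function names are hypothetical. It assumes the alignment is a power of two no smaller than `sizeof(void *)`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: returns `bytes` of storage aligned to `alignment`.
void *aligned_alloc_sketch(std::size_t bytes, std::size_t alignment)
{
    // Reserve room for worst-case padding plus the stashed pointer.
    void *raw = std::malloc(bytes + alignment + sizeof(void *));
    if (!raw) return nullptr;

    // Skip past the slot for the stashed pointer, then round up.
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void *);
    std::uintptr_t aligned =
        (start + alignment - 1) & ~(static_cast<std::uintptr_t>(alignment) - 1);

    // Remember the real allocation immediately before the returned block.
    void **user = reinterpret_cast<void **>(aligned);
    user[-1] = raw;
    return user;
}

// Hypothetical counterpart: frees a block from aligned_alloc_sketch().
void aligned_free_sketch(void *ptr)
{
    if (ptr) std::free(static_cast<void **>(ptr)[-1]);
}

void aligned_demo()
{
    void *p = aligned_alloc_sketch(100, 32);
    assert(p != nullptr);
    assert((reinterpret_cast<std::uintptr_t>(p) & 31u) == 0);
    aligned_free_sketch(p);
}
```

As the comment in the patch says, this is the fallback of last resort; `posix_memalign` or `_aligned_malloc` is preferable where available, because memory obtained this way must never be passed to plain `free`.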
diff --git a/src/system/Thread.cpp b/src/system/Thread.cpp
index 07df4cd..d0e3360 100644
--- a/src/system/Thread.cpp
+++ b/src/system/Thread.cpp
@@ -1,25 +1,37 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
+#ifndef NO_THREADING
 
 #include "Thread.h"
 
 #include <iostream>
 #include <cstdlib>
 
+#ifdef USE_PTHREADS
 #include <sys/time.h>
 #include <time.h>
+#endif
 
 using std::cerr;
 using std::endl;
@@ -280,6 +292,7 @@ Condition::signal()
 
 #else /* !_WIN32 */
 
+#ifdef USE_PTHREADS
 
 Thread::Thread() :
     m_id(0),
@@ -541,6 +554,93 @@ Condition::signal()
     pthread_cond_signal(&m_condition);
 }
 
+#else /* !USE_PTHREADS */
+
+Thread::Thread()
+{
+}
+
+Thread::~Thread()
+{
+}
+
+void
+Thread::start()
+{
+    abort();
+}    
+
+void 
+Thread::wait()
+{
+    abort();
+}
+
+Thread::Id
+Thread::id()
+{
+    abort();
+}
+
+bool
+Thread::threadingAvailable()
+{
+    return false;
+}
+
+Mutex::Mutex()
+{
+}
+
+Mutex::~Mutex()
+{
+}
+
+void
+Mutex::lock()
+{
+    abort();
+}
+
+void
+Mutex::unlock()
+{
+    abort();
+}
+
+bool
+Mutex::trylock()
+{
+    abort();
+}
+
+Condition::Condition(const char *)
+{
+}
+
+Condition::~Condition()
+{
+}
+
+void
+Condition::lock()
+{
+    abort();
+}
+
+void 
+Condition::wait(int us)
+{
+    abort();
+}
+
+void
+Condition::signal()
+{
+    abort();
+}
+
+#endif /* !USE_PTHREADS */
 #endif /* !_WIN32 */
 
 MutexLocker::MutexLocker(Mutex *mutex) :
@@ -560,3 +660,4 @@ MutexLocker::~MutexLocker()
 
 }
 
+#endif /* !NO_THREADING */
diff --git a/src/system/Thread.h b/src/system/Thread.h
index f5a0053..0ce754e 100644
--- a/src/system/Thread.h
+++ b/src/system/Thread.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_THREAD_H_
@@ -17,11 +26,16 @@
 
 #include <string>
 
+#ifndef NO_THREADING
 
 #ifdef _WIN32
 #include <windows.h>
 #else /* !_WIN32 */
+#ifdef USE_PTHREADS
 #include <pthread.h>
+#else /* !USE_PTHREADS */
+#error No thread implementation selected
+#endif /* !USE_PTHREADS */
 #endif /* !_WIN32 */
 
 //#define DEBUG_THREAD 1
@@ -37,7 +51,9 @@ public:
 #ifdef _WIN32
     typedef HANDLE Id;
 #else
+#ifdef USE_PTHREADS
     typedef pthread_t Id;
+#endif
 #endif
 
     Thread();
@@ -59,10 +75,12 @@ private:
     bool m_extant;
     static DWORD WINAPI staticRun(LPVOID lpParam);
 #else
+#ifdef USE_PTHREADS
     pthread_t m_id;
     bool m_extant;
     static void *staticRun(void *);
 #endif
+#endif
 };
 
 class Mutex
@@ -82,12 +100,14 @@ private:
     DWORD m_lockedBy;
 #endif
 #else
+#ifdef USE_PTHREADS
     pthread_mutex_t m_mutex;
 #ifndef NO_THREAD_CHECKS
     pthread_t m_lockedBy;
     bool m_locked;
 #endif
 #endif
+#endif
 };
 
 class MutexLocker
@@ -133,10 +153,12 @@ private:
     HANDLE m_condition;
     bool m_locked;
 #else
+#ifdef USE_PTHREADS
     pthread_mutex_t m_mutex;
     pthread_cond_t m_condition;
     bool m_locked;
 #endif
+#endif
 #ifdef DEBUG_CONDITION
     std::string m_name;
 #endif
@@ -144,5 +166,67 @@ private:
 
 }
 
+#else
+
+/* Stub threading interface. We do not have threading support in this code. */
+
+namespace RubberBand
+{
+
+class Thread
+{
+public:
+    typedef unsigned int Id;
+
+    Thread() { }
+    virtual ~Thread() { }
+
+    Id id() { return 0; }
+
+    void start() { } 
+    void wait() { }
+
+    static bool threadingAvailable() { return false; }
+
+protected:
+    virtual void run() = 0;
+
+private:
+};
+
+class Mutex
+{
+public:
+    Mutex() { }
+    ~Mutex() { }
+
+    void lock() { }
+    void unlock() { }
+    bool trylock() { return false; }
+};
+
+class MutexLocker
+{
+public:
+    MutexLocker(Mutex *) { }
+    ~MutexLocker() { }
+};
+
+class Condition
+{
+public:
+    Condition(std::string name) { }
+    ~Condition() { }
+    
+    void lock() { }
+    void unlock() { }
+    void wait(int us = 0) { }
+
+    void signal() { }
+};
+
+}
+
+#endif /* NO_THREADING */
 
 #endif
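With `NO_THREADING` defined, the stub `Thread` in the header above has a `start()` that does nothing and a `threadingAvailable()` that returns false. Calling code therefore presumably has to check availability and do the work inline when no threads exist. The sketch below illustrates that pattern with a self-contained copy of the stub. The names `StubThread` and `Worker` are hypothetical, and the fallback logic is an assumption about how a caller might use the interface, not code from the library.

```cpp
#include <cassert>

// Minimal copy of the stub interface, renamed to make clear it is a sketch.
class StubThread
{
public:
    StubThread() { }
    virtual ~StubThread() { }

    void start() { }   // no-op: there is no thread to launch
    void wait() { }    // no-op: nothing to join

    static bool threadingAvailable() { return false; }

protected:
    virtual void run() = 0;
};

// Hypothetical worker that falls back to running synchronously.
class Worker : public StubThread
{
public:
    int result = 0;

    void process()
    {
        if (threadingAvailable()) {
            start();
            wait();
        } else {
            run();     // do the work on the calling thread
        }
    }

protected:
    void run() override { result = 42; }
};

int worker_demo()
{
    Worker w;
    w.process();
    assert(w.result == 42);
    return w.result;
}
```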
diff --git a/src/system/VectorOps.h b/src/system/VectorOps.h
index 0de89d8..8b954ee 100644
--- a/src/system/VectorOps.h
+++ b/src/system/VectorOps.h
@@ -1,21 +1,41 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_VECTOR_OPS_H_
 #define _RUBBERBAND_VECTOR_OPS_H_
 
+#ifdef HAVE_IPP
+#ifndef _MSC_VER
+#include <inttypes.h>
+#endif
+#include <ipps.h>
+#include <ippac.h>
+#endif
 
+#ifdef HAVE_VDSP
+#include <vecLib/vDSP.h>
+#include <vecLib/vForce.h>
+#endif
 
 #include <cstring>
 #include "sysutils.h"
@@ -40,6 +60,33 @@ inline void v_zero(T *const R__ ptr,
     }
 }
 
+#if defined HAVE_IPP
+template<> 
+inline void v_zero(float *const R__ ptr, 
+                   const int count)
+{
+    ippsZero_32f(ptr, count);
+}
+template<> 
+inline void v_zero(double *const R__ ptr,
+                   const int count)
+{
+    ippsZero_64f(ptr, count);
+}
+#elif defined HAVE_VDSP
+template<> 
+inline void v_zero(float *const R__ ptr, 
+                   const int count)
+{
+    vDSP_vclr(ptr, 1, count);
+}
+template<> 
+inline void v_zero(double *const R__ ptr,
+                   const int count)
+{
+    vDSP_vclrD(ptr, 1, count);
+}
+#endif
 
 template<typename T>
 inline void v_zero_channels(T *const R__ *const R__ ptr,
@@ -71,6 +118,22 @@ inline void v_copy(T *const R__ dst,
     }
 }
 
+#if defined HAVE_IPP
+template<>
+inline void v_copy(float *const R__ dst,
+                   const float *const R__ src,
+                   const int count)
+{
+    ippsCopy_32f(src, dst, count);
+}
+template<>
+inline void v_copy(double *const R__ dst,
+                   const double *const R__ src,
+                   const int count)
+{
+    ippsCopy_64f(src, dst, count);
+}
+#endif
 
 template<typename T>
 inline void v_copy_channels(T *const R__ *const R__ dst,
@@ -92,6 +155,22 @@ inline void v_move(T *const dst,
     memmove(dst, src, count * sizeof(T));
 }
 
+#if defined HAVE_IPP
+template<>
+inline void v_move(float *const dst,
+                   const float *const src,
+                   const int count)
+{
+    ippsMove_32f(src, dst, count);
+}
+template<>
+inline void v_move(double *const dst,
+                   const double *const src,
+                   const int count)
+{
+    ippsMove_64f(src, dst, count);
+}
+#endif
 
 template<typename T, typename U>
 inline void v_convert(U *const R__ dst,
@@ -118,6 +197,37 @@ inline void v_convert(double *const R__ dst,
     v_copy(dst, src, count);
 }
 
+#if defined HAVE_IPP
+template<>
+inline void v_convert(double *const R__ dst,
+                      const float *const R__ src,
+                      const int count)
+{
+    ippsConvert_32f64f(src, dst, count);
+}
+template<>
+inline void v_convert(float *const R__ dst,
+                      const double *const R__ src,
+                      const int count)
+{
+    ippsConvert_64f32f(src, dst, count);
+}
+#elif defined HAVE_VDSP
+template<>
+inline void v_convert(double *const R__ dst,
+                      const float *const R__ src,
+                      const int count)
+{
+    vDSP_vspdp((float *)src, 1, dst, 1, count);
+}
+template<>
+inline void v_convert(float *const R__ dst,
+                      const double *const R__ src,
+                      const int count)
+{
+    vDSP_vdpsp((double *)src, 1, dst, 1, count);
+}
+#endif
 
 template<typename T, typename U>
 inline void v_convert_channels(U *const R__ *const R__ dst,
@@ -150,6 +260,21 @@ inline void v_add(T *const R__ dst,
     }
 }
 
+#if defined HAVE_IPP
+template<>
+inline void v_add(float *const R__ dst,
+                  const float *const R__ src,
+                  const int count)
+{
+    ippsAdd_32f_I(src, dst, count);
+}    
+template<> inline void v_add(double *const R__ dst,
+                             const double *const R__ src,
+                             const int count)
+{
+    ippsAdd_64f_I(src, dst, count);
+}    
+#endif
 
 template<typename T>
 inline void v_add_channels(T *const R__ *const R__ dst,
@@ -194,6 +319,21 @@ inline void v_subtract(T *const R__ dst,
     }
 }
 
+#if defined HAVE_IPP
+template<>
+inline void v_subtract(float *const R__ dst,
+                       const float *const R__ src,
+                       const int count)
+{
+    ippsSub_32f_I(src, dst, count);
+}    
+template<> inline void v_subtract(double *const R__ dst,
+                                  const double *const R__ src,
+                                  const int count)
+{
+    ippsSub_64f_I(src, dst, count);
+}    
+#endif
 
 template<typename T, typename G>
 inline void v_scale(T *const R__ dst,
@@ -205,6 +345,22 @@ inline void v_scale(T *const R__ dst,
     }
 }
 
+#if defined HAVE_IPP 
+template<>
+inline void v_scale(float *const R__ dst,
+                    const float gain,
+                    const int count)
+{
+    ippsMulC_32f_I(gain, dst, count);
+}
+template<>
+inline void v_scale(double *const R__ dst,
+                    const double gain,
+                    const int count)
+{
+    ippsMulC_64f_I(gain, dst, count);
+}
+#endif
 
 template<typename T>
 inline void v_multiply(T *const R__ dst,
@@ -216,6 +372,22 @@ inline void v_multiply(T *const R__ dst,
     }
 }
 
+#if defined HAVE_IPP 
+template<>
+inline void v_multiply(float *const R__ dst,
+                       const float *const R__ src,
+                       const int count)
+{
+    ippsMul_32f_I(src, dst, count);
+}
+template<>
+inline void v_multiply(double *const R__ dst,
+                       const double *const R__ src,
+                       const int count)
+{
+    ippsMul_64f_I(src, dst, count);
+}
+#endif
 
 template<typename T>
 inline void v_multiply(T *const R__ dst,
@@ -238,7 +410,41 @@ inline void v_divide(T *const R__ dst,
     }
 }
 
+#if defined HAVE_IPP 
+template<>
+inline void v_divide(float *const R__ dst,
+                     const float *const R__ src,
+                     const int count)
+{
+    ippsDiv_32f_I(src, dst, count);
+}
+template<>
+inline void v_divide(double *const R__ dst,
+                     const double *const R__ src,
+                     const int count)
+{
+    ippsDiv_64f_I(src, dst, count);
+}
+#endif
 
+#if defined HAVE_IPP 
+template<>
+inline void v_multiply(float *const R__ dst,
+                       const float *const R__ src1,
+                       const float *const R__ src2,
+                       const int count)
+{
+    ippsMul_32f(src1, src2, dst, count);
+}    
+template<>
+inline void v_multiply(double *const R__ dst,
+                       const double *const R__ src1,
+                       const double *const R__ src2,
+                       const int count)
+{
+    ippsMul_64f(src1, src2, dst, count);
+}
+#endif
 
 template<typename T>
 inline void v_multiply_and_add(T *const R__ dst,
@@ -251,6 +457,24 @@ inline void v_multiply_and_add(T *const R__ dst,
     }
 }
 
+#if defined HAVE_IPP
+template<>
+inline void v_multiply_and_add(float *const R__ dst,
+                               const float *const R__ src1,
+                               const float *const R__ src2,
+                               const int count)
+{
+    ippsAddProduct_32f(src1, src2, dst, count);
+}
+template<>
+inline void v_multiply_and_add(double *const R__ dst,
+                               const double *const R__ src1,
+                               const double *const R__ src2,
+                               const int count)
+{
+    ippsAddProduct_64f(src1, src2, dst, count);
+}
+#endif
 
 template<typename T>
 inline T v_sum(const T *const R__ src,
@@ -272,6 +496,41 @@ inline void v_log(T *const R__ dst,
     }
 }
 
+#if defined HAVE_IPP
+template<>
+inline void v_log(float *const R__ dst,
+                  const int count)
+{
+    ippsLn_32f_I(dst, count);
+}
+template<>
+inline void v_log(double *const R__ dst,
+                  const int count)
+{
+    ippsLn_64f_I(dst, count);
+}
+#elif defined HAVE_VDSP
+// no in-place vForce functions for these -- can we use the
+// out-of-place functions with equal input and output vectors? can we
+// use an out-of-place one with temporary buffer and still be faster
+// than doing it any other way?
+template<>
+inline void v_log(float *const R__ dst,
+                  const int count)
+{
+    float tmp[count];
+    vvlogf(tmp, dst, &count);
+    v_copy(dst, tmp, count);
+}
+template<>
+inline void v_log(double *const R__ dst,
+                  const int count)
+{
+    double tmp[count];
+    vvlog(tmp, dst, &count);
+    v_copy(dst, tmp, count);
+}
+#endif
 
 template<typename T>
 inline void v_exp(T *const R__ dst,
@@ -282,6 +541,41 @@ inline void v_exp(T *const R__ dst,
     }
 }
 
+#if defined HAVE_IPP
+template<>
+inline void v_exp(float *const R__ dst,
+                  const int count)
+{
+    ippsExp_32f_I(dst, count);
+}
+template<>
+inline void v_exp(double *const R__ dst,
+                  const int count)
+{
+    ippsExp_64f_I(dst, count);
+}
+#elif defined HAVE_VDSP
+// no in-place vForce functions for these -- can we use the
+// out-of-place functions with equal input and output vectors? can we
+// use an out-of-place one with temporary buffer and still be faster
+// than doing it any other way?
+template<>
+inline void v_exp(float *const R__ dst,
+                  const int count)
+{
+    float tmp[count];
+    vvexpf(tmp, dst, &count);
+    v_copy(dst, tmp, count);
+}
+template<>
+inline void v_exp(double *const R__ dst,
+                  const int count)
+{
+    double tmp[count];
+    vvexp(tmp, dst, &count);
+    v_copy(dst, tmp, count);
+}
+#endif
 
 template<typename T>
 inline void v_sqrt(T *const R__ dst,
@@ -292,6 +586,41 @@ inline void v_sqrt(T *const R__ dst,
     }
 }
 
+#if defined HAVE_IPP
+template<>
+inline void v_sqrt(float *const R__ dst,
+                   const int count)
+{
+    ippsSqrt_32f_I(dst, count);
+}
+template<>
+inline void v_sqrt(double *const R__ dst,
+                   const int count)
+{
+    ippsSqrt_64f_I(dst, count);
+}
+#elif defined HAVE_VDSP
+// no in-place vForce functions for these -- can we use the
+// out-of-place functions with equal input and output vectors? can we
+// use an out-of-place one with temporary buffer and still be faster
+// than doing it any other way?
+template<>
+inline void v_sqrt(float *const R__ dst,
+                   const int count)
+{
+    float tmp[count];
+    vvsqrtf(tmp, dst, &count);
+    v_copy(dst, tmp, count);
+}
+template<>
+inline void v_sqrt(double *const R__ dst,
+                   const int count)
+{
+    double tmp[count];
+    vvsqrt(tmp, dst, &count);
+    v_copy(dst, tmp, count);
+}
+#endif
 
 template<typename T>
 inline void v_square(T *const R__ dst,
@@ -302,6 +631,20 @@ inline void v_square(T *const R__ dst,
     }
 }
 
+#if defined HAVE_IPP
+template<>
+inline void v_square(float *const R__ dst,
+                     const int count)
+{
+    ippsSqr_32f_I(dst, count);
+}
+template<>
+inline void v_square(double *const R__ dst,
+                     const int count)
+{
+    ippsSqr_64f_I(dst, count);
+}
+#endif
 
 template<typename T>
 inline void v_abs(T *const R__ dst,
@@ -312,6 +655,29 @@ inline void v_abs(T *const R__ dst,
     }
 }
 
+#if defined HAVE_IPP
+template<>
+inline void v_abs(float *const R__ dst,
+                  const int count)
+{
+    ippsAbs_32f_I(dst, count);
+}
+template<>
+inline void v_abs(double *const R__ dst,
+                  const int count)
+{
+    ippsAbs_64f_I(dst, count);
+}
+#elif defined HAVE_VDSP
+template<>
+inline void v_abs(float *const R__ dst,
+                  const int count)
+{
+    float tmp[count];
+    vvfabf(tmp, dst, &count);
+    v_copy(dst, tmp, count);
+}
+#endif
 
 template<typename T>
 inline void v_interleave(T *const R__ dst,
@@ -341,6 +707,17 @@ inline void v_interleave(T *const R__ dst,
     }
 }
 
+#if defined HAVE_IPP 
+template<>
+inline void v_interleave(float *const R__ dst,
+                         const float *const R__ *const R__ src,
+                         const int channels, 
+                         const int count)
+{
+    ippsInterleave_32f((const Ipp32f **)src, channels, count, dst);
+}
+// IPP does not (currently?) provide double-precision interleave
+#endif
 
 template<typename T>
 inline void v_deinterleave(T *const R__ *const R__ dst,
@@ -370,6 +747,17 @@ inline void v_deinterleave(T *const R__ *const R__ dst,
     }
 }
 
+#if defined HAVE_IPP
+template<>
+inline void v_deinterleave(float *const R__ *const R__ dst,
+                           const float *const R__ src,
+                           const int channels, 
+                           const int count)
+{
+    ippsDeinterleave_32f((const Ipp32f *)src, channels, count, (Ipp32f **)dst);
+}
+// IPP does not (currently?) provide double-precision deinterleave
+#endif
 
 template<typename T>
 inline void v_fftshift(T *const R__ ptr,
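Note on the HAVE_VDSP specialisations above: the comments ask whether an out-of-place vForce call plus a temporary buffer is an acceptable way to get in-place behaviour. The sketch below shows that pattern in isolation. It is only an illustration under stated assumptions: `OutOfPlaceFn`, `standin_expf` and `apply_in_place` are hypothetical names, `standin_expf` is a portable stand-in for `vvexpf` so that the example builds without Accelerate, and a `std::vector` replaces the variable-length array used in the patch.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Signature shaped like the vForce out-of-place calls in the patch:
// (output, input, pointer to element count).
typedef void (*OutOfPlaceFn)(float *, const float *, const int *);

// Hypothetical stand-in for vvexpf so the sketch builds anywhere.
inline void standin_expf(float *out, const float *in, const int *n)
{
    for (int i = 0; i < *n; ++i) out[i] = std::exp(in[i]);
}

// Run an out-of-place routine "in place": compute into a scratch
// buffer, then copy the results back over the input.
inline void apply_in_place(OutOfPlaceFn fn, float *dst, const int count)
{
    assert(count >= 0);
    if (count == 0) return;
    std::vector<float> tmp(count);   // heap-backed scratch space
    fn(tmp.data(), dst, &count);     // input and output never alias
    for (int i = 0; i < count; ++i) dst[i] = tmp[i];
}
```

Calling `apply_in_place(standin_expf, buf, n)` overwrites `buf` with the exponentials of its contents. The trade-off is that the vector allocates on every call, which may be unwelcome on a real-time audio thread, whereas the stack array in the patch avoids allocation but relies on a compiler extension and can exhaust the stack for large counts.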
diff --git a/src/system/VectorOpsComplex.cpp b/src/system/VectorOpsComplex.cpp
index 9f68900..809668b 100644
--- a/src/system/VectorOpsComplex.cpp
+++ b/src/system/VectorOpsComplex.cpp
@@ -1,24 +1,198 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "VectorOpsComplex.h"
 
 #include "system/sysutils.h"
 
+#include <cassert>
+
+#if defined USE_POMMIER_MATHFUN
+#if defined __ARMEL__
+#include "pommier/neon_mathfun.h"
+#else
+#include "pommier/sse_mathfun.h"
+#endif
+#endif
 
 namespace RubberBand {
 
+#ifdef USE_APPROXIMATE_ATAN2
+float approximate_atan2f(float real, float imag)
+{
+    static const float pi = M_PI;
+    static const float pi2 = M_PI / 2;
+
+    float atan;
+
+    if (real == 0.f) {
+
+        if (imag > 0.0f) atan = pi2;
+        else if (imag == 0.0f) atan = 0.0f;
+        else atan = -pi2;
+
+    } else {
+
+        float z = imag/real;
+        if (fabsf(z) < 1.f) {
+            atan = z / (1.f + 0.28f * z * z);
+            if (real < 0.f) {
+                if (imag < 0.f) atan -= pi;
+                else atan += pi;
+            }
+        } else {
+            atan = pi2 - z / (z * z + 0.28f);
+            if (imag < 0.f) atan -= pi;
+        }
+    }
+    return atan;
+}
+#endif
+
+#if defined USE_POMMIER_MATHFUN
+
+#ifdef __ARMEL__
+typedef union {
+  float f[4];
+  int i[4];
+  v4sf  v;
+} V4SF;
+#else
+typedef ALIGN16_BEG union {
+  float f[4];
+  int i[4];
+  v4sf  v;
+} ALIGN16_END V4SF;
+#endif
+
+void
+v_polar_to_cartesian_pommier(float *const R__ real,
+                             float *const R__ imag,
+                             const float *const R__ mag,
+                             const float *const R__ phase,
+                             const int count)
+{
+    int idx = 0, tidx = 0;
+    int i = 0;
+
+    for (i = 0; i + 4 < count; i += 4) {
+
+        V4SF fmag, fphase, fre, fim;
+
+        for (int j = 0; j < 4; ++j) {
+            fmag.f[j] = mag[idx];
+            fphase.f[j] = phase[idx++];
+        }
+
+        sincos_ps(fphase.v, &fim.v, &fre.v);
+
+        for (int j = 0; j < 4; ++j) {
+            real[tidx] = fre.f[j] * fmag.f[j];
+            imag[tidx++] = fim.f[j] * fmag.f[j];
+        }
+    }
+
+    while (i < count) {
+        float re, im;
+        c_phasor(&re, &im, phase[i]);
+        real[tidx] = re * mag[i];
+        imag[tidx++] = im * mag[i];
+        ++i;
+    }
+}    
+
+void
+v_polar_interleaved_to_cartesian_inplace_pommier(float *const R__ srcdst,
+                                                 const int count)
+{
+    int i;
+    int idx = 0, tidx = 0;
+
+    for (i = 0; i + 4 < count; i += 4) {
+
+        V4SF fmag, fphase, fre, fim;
+
+        for (int j = 0; j < 4; ++j) {
+            fmag.f[j] = srcdst[idx++];
+            fphase.f[j] = srcdst[idx++];
+        }
+
+        sincos_ps(fphase.v, &fim.v, &fre.v);
+
+        for (int j = 0; j < 4; ++j) {
+            srcdst[tidx++] = fre.f[j] * fmag.f[j];
+            srcdst[tidx++] = fim.f[j] * fmag.f[j];
+        }
+    }
+
+    while (i < count) {
+        float real, imag;
+        float mag = srcdst[idx++];
+        float phase = srcdst[idx++];
+        c_phasor(&real, &imag, phase);
+        srcdst[tidx++] = real * mag;
+        srcdst[tidx++] = imag * mag;
+        ++i;
+    }
+}    
+
+void
+v_polar_to_cartesian_interleaved_pommier(float *const R__ dst,
+                                         const float *const R__ mag,
+                                         const float *const R__ phase,
+                                         const int count)
+{
+    int i;
+    int idx = 0, tidx = 0;
+
+    for (i = 0; i + 4 <= count; i += 4) {
+
+        V4SF fmag, fphase, fre, fim;
+
+        for (int j = 0; j < 4; ++j) {
+            fmag.f[j] = mag[idx];
+            fphase.f[j] = phase[idx];
+            ++idx;
+        }
+
+        sincos_ps(fphase.v, &fim.v, &fre.v);
+
+        for (int j = 0; j < 4; ++j) {
+            dst[tidx++] = fre.f[j] * fmag.f[j];
+            dst[tidx++] = fim.f[j] * fmag.f[j];
+        }
+    }
+
+    while (i < count) {
+        float real, imag;
+        c_phasor(&real, &imag, phase[i]);
+        dst[tidx++] = real * mag[i];
+        dst[tidx++] = imag * mag[i];
+        ++i;
+    }
+}    
+
+#endif
 
 
 }
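Note on the Pommier routines above: each one processes the input in blocks of four lanes and then finishes any leftover elements in a scalar loop, so the block loop and the tail have to agree on a single running index. The sketch below shows that structure without any SIMD. It is a minimal illustration, not the library's code: `polar_to_cartesian_blocked` is a hypothetical name, and plain `std::sin` / `std::cos` calls stand in for the four-lane `sincos_ps`.

```cpp
#include <cassert>
#include <cmath>

// Convert polar (mag, phase) to cartesian (real, imag), working in
// blocks of four with a scalar tail. The inner loops stand in for a
// four-lane SIMD sincos; no SIMD intrinsics are used here.
inline void polar_to_cartesian_blocked(float *real, float *imag,
                                       const float *mag,
                                       const float *phase,
                                       const int count)
{
    assert(count >= 0);
    int i = 0; // one index, shared by the block loop and the tail

    for (; i + 4 <= count; i += 4) {
        float s[4], c[4];
        for (int j = 0; j < 4; ++j) { // fill all four lanes
            s[j] = std::sin(phase[i + j]);
            c[j] = std::cos(phase[i + j]);
        }
        for (int j = 0; j < 4; ++j) {
            real[i + j] = c[j] * mag[i + j];
            imag[i + j] = s[j] * mag[i + j];
        }
    }

    for (; i < count; ++i) { // zero to three leftover elements
        real[i] = std::cos(phase[i]) * mag[i];
        imag[i] = std::sin(phase[i]) * mag[i];
    }
}
```

Because the block loop advances `i` by exactly the number of lanes it fills, the tail picks up at the first unprocessed element and every output slot is written once, whatever the value of `count`.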
diff --git a/src/system/VectorOpsComplex.h b/src/system/VectorOpsComplex.h
index c3786a8..519cadf 100644
--- a/src/system/VectorOpsComplex.h
+++ b/src/system/VectorOpsComplex.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_VECTOR_OPS_COMPLEX_H_
@@ -26,7 +35,22 @@ inline void c_phasor(T *real, T *imag, T phase)
 {
     //!!! IPP contains ippsSinCos_xxx in ippvm.h -- these are
     //!!! fixed-accuracy, test and compare
-#if defined __GNUC__
+#if defined HAVE_VDSP
+    int one = 1;
+    if (sizeof(T) == sizeof(float)) {
+        vvsincosf((float *)imag, (float *)real, (const float *)&phase, &one);
+    } else {
+        vvsincos((double *)imag, (double *)real, (const double *)&phase, &one);
+    }
+#elif defined LACK_SINCOS
+    if (sizeof(T) == sizeof(float)) {
+        *real = cosf(phase);
+        *imag = sinf(phase);
+    } else {
+        *real = cos(phase);
+        *imag = sin(phase);
+    }
+#elif defined __GNUC__
     if (sizeof(T) == sizeof(float)) {
         sincosf(phase, (float *)imag, (float *)real);
     } else {
@@ -50,23 +74,35 @@ inline void c_magphase(T *mag, T *phase, T real, T imag)
     *phase = atan2(imag, real);
 }
 
+#ifdef USE_APPROXIMATE_ATAN2
+// NB arguments in opposite order from usual for atan2f
+extern float approximate_atan2f(float real, float imag);
+template<>
+inline void c_magphase(float *mag, float *phase, float real, float imag)
+{
+    float atan = approximate_atan2f(real, imag);
+    *phase = atan;
+    *mag = sqrtf(real * real + imag * imag);
+}
+#else
 template<>
 inline void c_magphase(float *mag, float *phase, float real, float imag)
 {
     *mag = sqrtf(real * real + imag * imag);
     *phase = atan2f(imag, real);
 }
+#endif
 
 
-template<typename T>
+template<typename S, typename T> // S source, T target
 void v_polar_to_cartesian(T *const R__ real,
                           T *const R__ imag,
-                          T *const R__ mag,
-                          T *const R__ phase,
+                          const S *const R__ mag,
+                          const S *const R__ phase,
                           const int count)
 {
     for (int i = 0; i < count; ++i) {
-        c_phasor(real + i, imag + i, phase[i]);
+        c_phasor<T>(real + i, imag + i, phase[i]);
     }
     v_multiply(real, mag, count);
     v_multiply(imag, mag, count);
@@ -86,29 +122,117 @@ void v_polar_interleaved_to_cartesian_inplace(T *const R__ srcdst,
     }
 }
 
-template<typename T>
+template<typename S, typename T> // S source, T target
+void v_polar_to_cartesian_interleaved(T *const R__ dst,
+                                      const S *const R__ mag,
+                                      const S *const R__ phase,
+                                      const int count)
+{
+    T real, imag;
+    for (int i = 0; i < count; ++i) {
+        c_phasor<T>(&real, &imag, phase[i]);
+        real *= mag[i];
+        imag *= mag[i];
+        dst[i*2] = real;
+        dst[i*2+1] = imag;
+    }
+}    
+
+#if defined USE_POMMIER_MATHFUN
+void v_polar_to_cartesian_pommier(float *const R__ real,
+                                  float *const R__ imag,
+                                  const float *const R__ mag,
+                                  const float *const R__ phase,
+                                  const int count);
+void v_polar_interleaved_to_cartesian_inplace_pommier(float *const R__ srcdst,
+                                                      const int count);
+void v_polar_to_cartesian_interleaved_pommier(float *const R__ dst,
+                                              const float *const R__ mag,
+                                              const float *const R__ phase,
+                                              const int count);
+
+template<>
+inline void v_polar_to_cartesian(float *const R__ real,
+                                 float *const R__ imag,
+                                 const float *const R__ mag,
+                                 const float *const R__ phase,
+                                 const int count)
+{
+    v_polar_to_cartesian_pommier(real, imag, mag, phase, count);
+}
+
+template<>
+inline void v_polar_interleaved_to_cartesian_inplace(float *const R__ srcdst,
+                                                     const int count)
+{
+    v_polar_interleaved_to_cartesian_inplace_pommier(srcdst, count);
+}
+
+template<>
+inline void v_polar_to_cartesian_interleaved(float *const R__ dst,
+                                             const float *const R__ mag,
+                                             const float *const R__ phase,
+                                             const int count)
+{
+    v_polar_to_cartesian_interleaved_pommier(dst, mag, phase, count);
+}
+
+#endif
+
+template<typename S, typename T> // S source, T target
 void v_cartesian_to_polar(T *const R__ mag,
                           T *const R__ phase,
-                          T *const R__ real,
-                          T *const R__ imag,
+                          const S *const R__ real,
+                          const S *const R__ imag,
                           const int count)
 {
     for (int i = 0; i < count; ++i) {
-        c_magphase(mag + i, phase + i, real[i], imag[i]);
+        c_magphase<T>(mag + i, phase + i, real[i], imag[i]);
     }
 }
 
-template<typename T>
+template<typename S, typename T> // S source, T target
 void v_cartesian_interleaved_to_polar(T *const R__ mag,
                                       T *const R__ phase,
-                                      const T *const R__ src,
+                                      const S *const R__ src,
                                       const int count)
 {
     for (int i = 0; i < count; ++i) {
-        c_magphase(mag + i, phase + i, src[i*2], src[i*2+1]);
+        c_magphase<T>(mag + i, phase + i, src[i*2], src[i*2+1]);
     }
 }
 
+#ifdef HAVE_VDSP
+template<>
+inline void v_cartesian_to_polar(float *const R__ mag,
+                                 float *const R__ phase,
+                                 const float *const R__ real,
+                                 const float *const R__ imag,
+                                 const int count)
+{
+    DSPSplitComplex c;
+    c.realp = const_cast<float *>(real);
+    c.imagp = const_cast<float *>(imag);
+    vDSP_zvmags(&c, 1, phase, 1, count); // using phase as a temporary dest
+    vvsqrtf(mag, phase, &count); // using phase as the source
+    vvatan2f(phase, imag, real, &count);
+}
+template<>
+inline void v_cartesian_to_polar(double *const R__ mag,
+                                 double *const R__ phase,
+                                 const double *const R__ real,
+                                 const double *const R__ imag,
+                                 const int count)
+{
+    // double precision, this is significantly faster than using vDSP_polar
+    DSPDoubleSplitComplex c;
+    c.realp = const_cast<double *>(real);
+    c.imagp = const_cast<double *>(imag);
+    vDSP_zvmagsD(&c, 1, phase, 1, count); // using phase as a temporary dest
+    vvsqrt(mag, phase, &count); // using phase as the source
+    vvatan2(phase, imag, real, &count);
+}
+#endif
 
 template<typename T>
 void v_cartesian_to_polar_interleaved_inplace(T *const R__ srcdst,
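Note on USE_APPROXIMATE_ATAN2 above: the float specialisation of c_magphase swaps atan2f for a rational approximation whose arguments are in (real, imag) order. The sketch below restates that approximation as a standalone function and measures how far it strays from `std::atan2` over a grid of sample points. The names `approx_atan2_sketch` and `max_atan2_error`, the grid spacing of 0.25 and the locally defined value of pi are assumptions made for this illustration only.

```cpp
#include <cassert>
#include <cmath>

// Same rational approximation as the patch, written as a standalone
// function. Note the (real, imag) argument order, which is the
// reverse of std::atan2(imag, real).
inline float approx_atan2_sketch(float real, float imag)
{
    const float pi = 3.14159265358979f;
    const float pi2 = pi / 2.f;

    if (real == 0.f) {
        if (imag > 0.f) return pi2;
        if (imag == 0.f) return 0.f;
        return -pi2;
    }

    const float z = imag / real;
    float a;
    if (std::fabs(z) < 1.f) {
        a = z / (1.f + 0.28f * z * z);
        if (real < 0.f) a += (imag < 0.f) ? -pi : pi;
    } else {
        a = pi2 - z / (z * z + 0.28f);
        if (imag < 0.f) a -= pi;
    }
    return a;
}

// Largest absolute difference from std::atan2 over a square grid of
// (2 * steps + 1)^2 points spaced 0.25 apart around the origin.
inline float max_atan2_error(const int steps)
{
    assert(steps > 0);
    float worst = 0.f;
    for (int r = -steps; r <= steps; ++r) {
        for (int i = -steps; i <= steps; ++i) {
            const float re = r * 0.25f, im = i * 0.25f;
            const float err = std::fabs(approx_atan2_sketch(re, im)
                                        - std::atan2(im, re));
            if (err > worst) worst = err;
        }
    }
    return worst;
}
```

A call such as `max_atan2_error(20)` gives a quick feel for the accuracy being traded away. The error of this form of approximation is on the order of a few thousandths of a radian, which may or may not be acceptable depending on how the phase values are used downstream.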
diff --git a/src/system/sysutils.cpp b/src/system/sysutils.cpp
index ce6001b..be12074 100644
--- a/src/system/sysutils.cpp
+++ b/src/system/sysutils.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "sysutils.h"
@@ -38,7 +47,14 @@
 #include <cstdlib>
 #include <iostream>
 
+#ifdef HAVE_IPP
+#include <ipp.h> // for static init
+#endif
 
+#ifdef HAVE_VDSP
+#include <vecLib/vDSP.h>
+#include <fenv.h>
+#endif
 
 #ifdef _WIN32
 #include <fstream>
@@ -55,17 +71,17 @@ system_get_platform_tag()
 #else /* !_WIN32 */
 #ifdef __APPLE__
     return "osx";
-#else 
+#else /* !__APPLE__ */
 #ifdef __LINUX__
     if (sizeof(long) == 8) {
         return "linux64";
     } else {
         return "linux";
     }
-#else 
+#else /* !__LINUX__ */
     return "posix";
-#endif 
-#endif 
+#endif /* !__LINUX__ */
+#endif /* !__APPLE__ */
 #endif /* !_WIN32 */
 }
 
@@ -192,6 +208,30 @@ void clock_gettime(int, struct timespec *ts)
 
 void system_specific_initialise()
 {
+#if defined HAVE_IPP
+#ifndef USE_IPP_DYNAMIC_LIBS
+//    std::cerr << "Calling ippStaticInit" << std::endl;
+    ippStaticInit();
+#endif
+    ippSetDenormAreZeros(1);
+#elif defined HAVE_VDSP
+#if defined __i386__ || defined __x86_64__ 
+    fesetenv(FE_DFL_DISABLE_SSE_DENORMS_ENV);
+#endif
+#endif
+#if defined __ARMEL__
+    static const unsigned int x = 0x04086060;
+    static const unsigned int y = 0x03000000;
+    int r;
+    asm volatile (
+        "fmrx	%0, fpscr   \n\t"
+        "and	%0, %0, %1  \n\t"
+        "orr	%0, %0, %2  \n\t"
+        "fmxr	fpscr, %0   \n\t"
+        : "=r"(r)
+        : "r"(x), "r"(y)
+        );
+#endif
 }
 
 void system_specific_application_initialise()
@@ -226,9 +266,13 @@ system_get_process_status(int pid)
 #ifdef _WIN32
 void system_memorybarrier()
 {
+#ifdef __MSVC__
+    MemoryBarrier();
+#else /* (mingw) */
     LONG Barrier = 0;
     __asm__ __volatile__("xchgl %%eax,%0 "
                          : "=r" (Barrier));
+#endif
 }
 #else /* !_WIN32 */
 #if (__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 1)
diff --git a/src/system/sysutils.h b/src/system/sysutils.h
index e61e51c..371736b 100644
--- a/src/system/sysutils.h
+++ b/src/system/sysutils.h
@@ -1,20 +1,33 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_SYSUTILS_H_
 #define _RUBBERBAND_SYSUTILS_H_
 
+#ifdef __MSVC__
+#include "float_cast/float_cast.h"
+#define R__ __restrict
+#endif
 
 #ifdef __GNUC__
 #define R__ __restrict__
@@ -27,11 +40,26 @@
 #ifdef __MINGW32__
 #include <malloc.h>
 #else
+#ifndef __MSVC__
 #include <alloca.h>
 #endif
+#endif
 
+#ifdef __MSVC__
+#include <malloc.h>
+#include <process.h>
+#define alloca _alloca
+#define getpid _getpid
+#endif
 
+#ifdef __MSVC__
+#define uint8_t unsigned __int8
+#define uint16_t unsigned __int16
+#define uint32_t unsigned __int32
+#define ssize_t long
+#else
 #include <stdint.h>
+#endif
 
 #include <math.h>
 
@@ -49,6 +77,7 @@ extern ProcessStatus system_get_process_status(int pid);
 struct timespec { long tv_sec; long tv_nsec; };
 void clock_gettime(int clk_id, struct timespec *p);
 #define CLOCK_MONOTONIC 1
+#define CLOCK_REALTIME 2
 #endif
 
 #ifdef _WIN32
@@ -60,9 +89,15 @@ struct timespec { long tv_sec; long tv_nsec; };
 // always uses GetPerformanceCounter, does not check whether it's valid or not:
 void clock_gettime(int clk_id, struct timespec *p);
 #define CLOCK_MONOTONIC 1
+#define CLOCK_REALTIME 2
 
 #endif
 
+#ifdef __MSVC__
+
+void usleep(unsigned long);
+
+#endif
 
 inline double mod(double x, double y) { return x - (y * floor(x / y)); }
 inline float modf(float x, float y) { return x - (y * float(floor(x / y))); }
@@ -125,5 +160,9 @@ extern void system_memorybarrier();
 
 #endif
 
+#ifdef NO_THREADING
+#undef MBARRIER
+#define MBARRIER() 
+#endif
 
 #endif
diff --git a/vamp/RubberBandVampPlugin.cpp b/vamp/RubberBandVampPlugin.cpp
index ca1a76a..f8e38dc 100644
--- a/vamp/RubberBandVampPlugin.cpp
+++ b/vamp/RubberBandVampPlugin.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include "RubberBandVampPlugin.h"
diff --git a/vamp/RubberBandVampPlugin.h b/vamp/RubberBandVampPlugin.h
index 28bdf8d..4f350af 100644
--- a/vamp/RubberBandVampPlugin.h
+++ b/vamp/RubberBandVampPlugin.h
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #ifndef _RUBBERBAND_VAMP_PLUGIN_H_
diff --git a/vamp/libmain.cpp b/vamp/libmain.cpp
index 0b4089f..0702c1b 100644
--- a/vamp/libmain.cpp
+++ b/vamp/libmain.cpp
@@ -1,15 +1,24 @@
 /* -*- c-basic-offset: 4 indent-tabs-mode: nil -*-  vi:set ts=8 sts=4 sw=4: */
 
 /*
-    Rubber Band
+    Rubber Band Library
     An audio time-stretching and pitch-shifting library.
-    Copyright 2007-2011 Chris Cannam.
-    
+    Copyright 2007-2012 Particular Programs Ltd.
+
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2 of the
     License, or (at your option) any later version.  See the file
     COPYING included with this distribution for more information.
+
+    Alternatively, if you have a valid commercial licence for the
+    Rubber Band Library obtained by agreement with the copyright
+    holders, you may redistribute and/or modify it under the terms
+    described in that licence.
+
+    If you wish to distribute code using the Rubber Band Library
+    under terms other than those of the GNU General Public License,
+    you must obtain a valid commercial licence before doing so.
 */
 
 #include <vamp/vamp.h>