FTL B3 should be able to run quicksort asm.js test
https://bugs.webkit.org/show_bug.cgi?id=152105

Reviewed by Geoffrey Garen.

This covers making all of the changes needed to run quicksort.js from AsmBench.

- Reintroduced float types to FTLLower since we now have B3::Float.

- Gave FTL::Output the ability to speak of load types and store types separately from LValue
  types. This dodges the problem that B3 doesn't have types for Int8 and Int16 but supports loads
  and stores of those widths.

- Implemented Mod in B3 and wrote tests.
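
As a rough illustration of what "chill" remainder means for the constant folding added here, the
sketch below assumes that ChillMod yields 0 for a zero denominator and for INT32_MIN % -1, the two
cases where plain C++ '%' is undefined. The helper name chillMod32 is hypothetical and is not taken
from the patch.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper, not WebKit code: a "chill" 32-bit remainder that
// returns 0 in the two cases where C++ '%' has undefined behavior.
// The choice of 0 for those cases is an assumption about ChillMod.
int32_t chillMod32(int32_t numerator, int32_t denominator)
{
    if (!denominator)
        return 0; // x % 0 is treated as 0 instead of trapping.
    if (numerator == std::numeric_limits<int32_t>::min() && denominator == -1)
        return 0; // INT32_MIN / -1 overflows; the remainder is treated as 0.
    return numerator % denominator; // Ordinary truncating remainder.
}
```

A constant folder can call a helper like this unconditionally, since it never needs to bail out on
the undefined cases.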

I also fixed a pre-existing bug in a test that appeared to only manifest in release builds.
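
The load/store type split in the list above can be pictured with a small standalone sketch: the
value is always carried as a 32-bit integer, and a separate enum says how many bytes reach memory.
The names StoreKind and storeInt32 are hypothetical and only mirror the idea, not the FTL::Output
API.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical names throughout; this mirrors the idea, not FTL::Output.
enum class StoreKind { Store32As8, Store32As16, Store32 };

// The value always has a 32-bit type; only the width of the store varies.
void storeInt32(int32_t value, void* address, StoreKind kind)
{
    switch (kind) {
    case StoreKind::Store32As8: {
        uint8_t narrow = static_cast<uint8_t>(value); // Keep the low 8 bits.
        std::memcpy(address, &narrow, sizeof(narrow));
        return;
    }
    case StoreKind::Store32As16: {
        uint16_t narrow = static_cast<uint16_t>(value); // Keep the low 16 bits.
        std::memcpy(address, &narrow, sizeof(narrow));
        return;
    }
    case StoreKind::Store32:
        std::memcpy(address, &value, sizeof(value));
        return;
    }
}
```

With this shape the caller never needs an 8-bit or 16-bit value type; it picks a store kind and
passes the 32-bit value through unchanged.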

Currently, B3's performance on asm.js tests is not good. It should be easy to fix:

- B3 should strength-reduce the shifting madness that happens in asm.js memory accesses
  https://bugs.webkit.org/show_bug.cgi?id=152106

- B3 constant hoisting should have a story for the asm.js heap constant
  https://bugs.webkit.org/show_bug.cgi?id=152107
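
For the first follow-up, a plausible reading (an assumption, since the details are not given here)
is that asm.js code indexes a 32-bit heap view as HEAP32[p >> 2], and the engine then scales the
index back up by 4, so the byte offset is computed as (p >> 2) << 2. The sketch below uses
hypothetical helper names to show that this pair of shifts is equivalent to a single mask.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not code from B3.

// The address computation as written: shift down to an index, then scale
// the index back up to a byte offset.
uint32_t byteOffsetWithShifts(uint32_t pointer)
{
    uint32_t index = pointer >> 2;
    return index << 2;
}

// The strength-reduced form: clear the low two bits with one mask.
uint32_t byteOffsetWithMask(uint32_t pointer)
{
    return pointer & ~UINT32_C(3);
}
```

Unsigned arithmetic is used here to keep the shifts well defined in C++; how B3 would actually
perform the reduction is left to the linked bug.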

* b3/B3CCallValue.h:
* b3/B3Const32Value.cpp:
(JSC::B3::Const32Value::divConstant):
(JSC::B3::Const32Value::modConstant):
(JSC::B3::Const32Value::bitAndConstant):
* b3/B3Const32Value.h:
* b3/B3Const64Value.cpp:
(JSC::B3::Const64Value::divConstant):
(JSC::B3::Const64Value::modConstant):
(JSC::B3::Const64Value::bitAndConstant):
* b3/B3Const64Value.h:
* b3/B3ReduceStrength.cpp:
* b3/B3Validate.cpp:
* b3/B3Value.cpp:
(JSC::B3::Value::divConstant):
(JSC::B3::Value::modConstant):
(JSC::B3::Value::bitAndConstant):
* b3/B3Value.h:
* b3/testb3.cpp:
(JSC::B3::testChillDiv64):
(JSC::B3::testMod):
(JSC::B3::testSwitch):
(JSC::B3::run):
* ftl/FTLB3Output.cpp:
(JSC::FTL::Output::load16ZeroExt32):
(JSC::FTL::Output::store):
(JSC::FTL::Output::store32As8):
(JSC::FTL::Output::store32As16):
(JSC::FTL::Output::loadFloatToDouble): Deleted.
* ftl/FTLB3Output.h:
(JSC::FTL::Output::mul):
(JSC::FTL::Output::div):
(JSC::FTL::Output::chillDiv):
(JSC::FTL::Output::rem):
(JSC::FTL::Output::neg):
(JSC::FTL::Output::load32):
(JSC::FTL::Output::load64):
(JSC::FTL::Output::loadPtr):
(JSC::FTL::Output::loadFloat):
(JSC::FTL::Output::loadDouble):
(JSC::FTL::Output::store32):
(JSC::FTL::Output::store64):
(JSC::FTL::Output::storePtr):
(JSC::FTL::Output::storeFloat):
(JSC::FTL::Output::storeDouble):
(JSC::FTL::Output::addPtr):
(JSC::FTL::Output::extractValue):
(JSC::FTL::Output::call):
(JSC::FTL::Output::operation):
* ftl/FTLLowerDFGToLLVM.cpp:
(JSC::FTL::DFG::LowerDFGToLLVM::compileGetByVal):
(JSC::FTL::DFG::LowerDFGToLLVM::compilePutByVal):
(JSC::FTL::DFG::LowerDFGToLLVM::compileArrayPush):
(JSC::FTL::DFG::LowerDFGToLLVM::compileArrayPop):
* ftl/FTLOutput.cpp:
(JSC::FTL::Output::Output):
(JSC::FTL::Output::store):
(JSC::FTL::Output::check):
(JSC::FTL::Output::load):
* ftl/FTLOutput.h:
(JSC::FTL::Output::load32):
(JSC::FTL::Output::load64):
(JSC::FTL::Output::loadPtr):
(JSC::FTL::Output::loadFloat):
(JSC::FTL::Output::loadDouble):
(JSC::FTL::Output::store32As8):
(JSC::FTL::Output::store32As16):
(JSC::FTL::Output::store32):
(JSC::FTL::Output::store64):
(JSC::FTL::Output::storePtr):
(JSC::FTL::Output::storeFloat):
(JSC::FTL::Output::storeDouble):
(JSC::FTL::Output::addPtr):
(JSC::FTL::Output::loadFloatToDouble): Deleted.
(JSC::FTL::Output::store16): Deleted.



git-svn-id: http://svn.webkit.org/repository/webkit/trunk@193943 268f45cc-cd09-0410-ab3c-d52691b4dbfc
diff --git a/Source/JavaScriptCore/ChangeLog b/Source/JavaScriptCore/ChangeLog
index 9c5ad60..5c1f1f3 100644
--- a/Source/JavaScriptCore/ChangeLog
+++ b/Source/JavaScriptCore/ChangeLog
@@ -1,3 +1,106 @@
+2015-12-09  Filip Pizlo  <fpizlo@apple.com>
+
+        FTL B3 should be able to run quicksort asm.js test
+        https://bugs.webkit.org/show_bug.cgi?id=152105
+
+        Reviewed by Geoffrey Garen.
+
+        This covers making all of the changes needed to run quicksort.js from AsmBench.
+
+        - Reintroduced float types to FTLLower since we now have B3::Float.
+
+        - Gave FTL::Output the ability to speak of load types and store types separately from LValue
+          types. This dodges the problem that B3 doesn't have types for Int8 and Int16 but supports loads
+          and stores of those widths.
+
+        - Implemented Mod in B3 and wrote tests.
+
+        I also fixed a pre-existing bug in a test that appeared to only manifest in release builds.
+
+        Currently, B3's performance on asm.js tests is not good. It should be easy to fix:
+
+        - B3 should strength-reduce the shifting madness that happens in asm.js memory accesses
+          https://bugs.webkit.org/show_bug.cgi?id=152106
+
+        - B3 constant hoisting should have a story for the asm.js heap constant
+          https://bugs.webkit.org/show_bug.cgi?id=152107
+
+        * b3/B3CCallValue.h:
+        * b3/B3Const32Value.cpp:
+        (JSC::B3::Const32Value::divConstant):
+        (JSC::B3::Const32Value::modConstant):
+        (JSC::B3::Const32Value::bitAndConstant):
+        * b3/B3Const32Value.h:
+        * b3/B3Const64Value.cpp:
+        (JSC::B3::Const64Value::divConstant):
+        (JSC::B3::Const64Value::modConstant):
+        (JSC::B3::Const64Value::bitAndConstant):
+        * b3/B3Const64Value.h:
+        * b3/B3ReduceStrength.cpp:
+        * b3/B3Validate.cpp:
+        * b3/B3Value.cpp:
+        (JSC::B3::Value::divConstant):
+        (JSC::B3::Value::modConstant):
+        (JSC::B3::Value::bitAndConstant):
+        * b3/B3Value.h:
+        * b3/testb3.cpp:
+        (JSC::B3::testChillDiv64):
+        (JSC::B3::testMod):
+        (JSC::B3::testSwitch):
+        (JSC::B3::run):
+        * ftl/FTLB3Output.cpp:
+        (JSC::FTL::Output::load16ZeroExt32):
+        (JSC::FTL::Output::store):
+        (JSC::FTL::Output::store32As8):
+        (JSC::FTL::Output::store32As16):
+        (JSC::FTL::Output::loadFloatToDouble): Deleted.
+        * ftl/FTLB3Output.h:
+        (JSC::FTL::Output::mul):
+        (JSC::FTL::Output::div):
+        (JSC::FTL::Output::chillDiv):
+        (JSC::FTL::Output::rem):
+        (JSC::FTL::Output::neg):
+        (JSC::FTL::Output::load32):
+        (JSC::FTL::Output::load64):
+        (JSC::FTL::Output::loadPtr):
+        (JSC::FTL::Output::loadFloat):
+        (JSC::FTL::Output::loadDouble):
+        (JSC::FTL::Output::store32):
+        (JSC::FTL::Output::store64):
+        (JSC::FTL::Output::storePtr):
+        (JSC::FTL::Output::storeFloat):
+        (JSC::FTL::Output::storeDouble):
+        (JSC::FTL::Output::addPtr):
+        (JSC::FTL::Output::extractValue):
+        (JSC::FTL::Output::call):
+        (JSC::FTL::Output::operation):
+        * ftl/FTLLowerDFGToLLVM.cpp:
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileGetByVal):
+        (JSC::FTL::DFG::LowerDFGToLLVM::compilePutByVal):
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileArrayPush):
+        (JSC::FTL::DFG::LowerDFGToLLVM::compileArrayPop):
+        * ftl/FTLOutput.cpp:
+        (JSC::FTL::Output::Output):
+        (JSC::FTL::Output::store):
+        (JSC::FTL::Output::check):
+        (JSC::FTL::Output::load):
+        * ftl/FTLOutput.h:
+        (JSC::FTL::Output::load32):
+        (JSC::FTL::Output::load64):
+        (JSC::FTL::Output::loadPtr):
+        (JSC::FTL::Output::loadFloat):
+        (JSC::FTL::Output::loadDouble):
+        (JSC::FTL::Output::store32As8):
+        (JSC::FTL::Output::store32As16):
+        (JSC::FTL::Output::store32):
+        (JSC::FTL::Output::store64):
+        (JSC::FTL::Output::storePtr):
+        (JSC::FTL::Output::storeFloat):
+        (JSC::FTL::Output::storeDouble):
+        (JSC::FTL::Output::addPtr):
+        (JSC::FTL::Output::loadFloatToDouble): Deleted.
+        (JSC::FTL::Output::store16): Deleted.
+
 2015-12-10  Filip Pizlo  <fpizlo@apple.com>
 
         Consider still matching an address expression even if B3 has already assigned a Tmp to it
@@ -227,6 +330,7 @@
         (JSC::ArrayPrototype::finishCreation):
         * runtime/CommonIdentifiers.h:
 
+>>>>>>> .r193940
 2015-12-08  Filip Pizlo  <fpizlo@apple.com>
 
         FTL B3 should have basic GetById support
diff --git a/Source/JavaScriptCore/b3/B3CCallValue.h b/Source/JavaScriptCore/b3/B3CCallValue.h
index 7f2cb8b..327f64b 100644
--- a/Source/JavaScriptCore/b3/B3CCallValue.h
+++ b/Source/JavaScriptCore/b3/B3CCallValue.h
@@ -49,6 +49,7 @@
         : Value(index, CheckedOpcode, CCall, type, origin, arguments...)
         , effects(Effects::forCall())
     {
+        RELEASE_ASSERT(numChildren() >= 1);
     }
 
     template<typename... Arguments>
@@ -56,6 +57,7 @@
         : Value(index, CheckedOpcode, CCall, type, origin, arguments...)
         , effects(effects)
     {
+        RELEASE_ASSERT(numChildren() >= 1);
     }
 };
 
diff --git a/Source/JavaScriptCore/b3/B3ReduceStrength.cpp b/Source/JavaScriptCore/b3/B3ReduceStrength.cpp
index 873eb00..d7aa09b 100644
--- a/Source/JavaScriptCore/b3/B3ReduceStrength.cpp
+++ b/Source/JavaScriptCore/b3/B3ReduceStrength.cpp
@@ -293,6 +293,9 @@
 
         case Mod:
         case ChillMod:
+            // Turn this: Mod(constant1, constant2)
+            // Into this: constant1 % constant2
+            // Note that this uses ChillMod semantics.
             replaceWithNewValue(m_value->child(0)->modConstant(m_proc, m_value->child(1)));
             break;
 
diff --git a/Source/JavaScriptCore/b3/B3Validate.cpp b/Source/JavaScriptCore/b3/B3Validate.cpp
index e00963c..f9965e6 100644
--- a/Source/JavaScriptCore/b3/B3Validate.cpp
+++ b/Source/JavaScriptCore/b3/B3Validate.cpp
@@ -289,8 +289,8 @@
                 validateStackAccess(value);
                 break;
             case CCall:
-                // This is a wildcard. You can pass any non-void arguments and you can select any
-                // return type.
+                VALIDATE(value->numChildren() >= 1, ("At ", *value));
+                VALIDATE(value->child(0)->type() == pointerType(), ("At ", *value));
                 break;
             case Patchpoint:
                 if (value->type() == Void)
diff --git a/Source/JavaScriptCore/b3/B3Value.h b/Source/JavaScriptCore/b3/B3Value.h
index ed2d10d..7ae7d2b 100644
--- a/Source/JavaScriptCore/b3/B3Value.h
+++ b/Source/JavaScriptCore/b3/B3Value.h
@@ -121,7 +121,7 @@
     virtual Value* checkMulConstant(Procedure&, const Value* other) const;
     virtual Value* checkNegConstant(Procedure&) const;
     virtual Value* divConstant(Procedure&, const Value* other) const; // This chooses ChillDiv semantics for integers.
-    virtual Value* modConstant(Procedure&, const Value* other) const; // This chooses ChillMod semantics for integers.
+    virtual Value* modConstant(Procedure&, const Value* other) const; // This chooses ChillMod semantics.
     virtual Value* bitAndConstant(Procedure&, const Value* other) const;
     virtual Value* bitOrConstant(Procedure&, const Value* other) const;
     virtual Value* bitXorConstant(Procedure&, const Value* other) const;
diff --git a/Source/JavaScriptCore/b3/testb3.cpp b/Source/JavaScriptCore/b3/testb3.cpp
index 0604bb0..d784c88 100644
--- a/Source/JavaScriptCore/b3/testb3.cpp
+++ b/Source/JavaScriptCore/b3/testb3.cpp
@@ -3000,12 +3000,13 @@
     Value* result = root->appendNew<Value>(proc, Sqrt, Origin(), asDouble);
     Value* floatResult = root->appendNew<Value>(proc, DoubleToFloat, Origin(), result);
     Value* result32 = root->appendNew<Value>(proc, BitwiseCast, Origin(), floatResult);
-    Value* doubleAddress = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR2);
+    Value* doubleAddress = root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR1);
     root->appendNew<MemoryValue>(proc, Store, Origin(), result, doubleAddress);
     root->appendNew<ControlValue>(proc, Return, Origin(), result32);
 
     double effect = 0;
-    CHECK(isIdentical(compileAndRun<int32_t>(proc, bitwise_cast<int32_t>(a), &effect), bitwise_cast<int32_t>(static_cast<float>(sqrt(a)))));
+    int32_t resultValue = compileAndRun<int32_t>(proc, bitwise_cast<int32_t>(a), &effect);
+    CHECK(isIdentical(resultValue, bitwise_cast<int32_t>(static_cast<float>(sqrt(a)))));
     CHECK(isIdentical(effect, sqrt(a)));
 }
 
@@ -7206,9 +7207,9 @@
             continue;                                   \
         tasks.append(createSharedTask<void()>(          \
             [=] () {                                    \
-                dataLog(testStr, "...\n");              \
+                dataLog(toCString(testStr, "...\n"));   \
                 test(a.value);                          \
-                dataLog(testStr, ": OK!\n");            \
+                dataLog(toCString(testStr, ": OK!\n")); \
             }));                                        \
     }
 
@@ -7220,9 +7221,9 @@
                 continue;                                   \
             tasks.append(createSharedTask<void()>(          \
                 [=] () {                                    \
-                    dataLog(testStr, "...\n");              \
+                    dataLog(toCString(testStr, "...\n"));   \
                     test(a.value, b.value);                 \
-                    dataLog(testStr, ": OK!\n");            \
+                    dataLog(toCString(testStr, ": OK!\n")); \
                 }));                                        \
         }                                                   \
     }
diff --git a/Source/JavaScriptCore/ftl/FTLB3Output.cpp b/Source/JavaScriptCore/ftl/FTLB3Output.cpp
index 5efc67d..293b3e9 100644
--- a/Source/JavaScriptCore/ftl/FTLB3Output.cpp
+++ b/Source/JavaScriptCore/ftl/FTLB3Output.cpp
@@ -97,18 +97,24 @@
     return load;
 }
 
-LValue Output::loadFloatToDouble(TypedPointer pointer)
-{
-    LValue loadedFloat = load(pointer, floatType);
-    return m_block->appendNew<B3::Value>(m_proc, B3::FloatToDouble, origin(), loadedFloat);
-}
-
 void Output::store(LValue value, TypedPointer pointer)
 {
     LValue store = m_block->appendNew<B3::MemoryValue>(m_proc, B3::Store, origin(), value, pointer.value());
     pointer.heap().decorateInstruction(store, *m_heaps);
 }
 
+void Output::store32As8(LValue value, TypedPointer pointer)
+{
+    LValue store = m_block->appendNew<B3::MemoryValue>(m_proc, B3::Store8, origin(), value, pointer.value());
+    pointer.heap().decorateInstruction(store, *m_heaps);
+}
+
+void Output::store32As16(LValue value, TypedPointer pointer)
+{
+    LValue store = m_block->appendNew<B3::MemoryValue>(m_proc, B3::Store16, origin(), value, pointer.value());
+    pointer.heap().decorateInstruction(store, *m_heaps);
+}
+
 LValue Output::baseIndex(LValue base, LValue index, Scale scale, ptrdiff_t offset)
 {
     LValue accumulatedOffset;
diff --git a/Source/JavaScriptCore/ftl/FTLB3Output.h b/Source/JavaScriptCore/ftl/FTLB3Output.h
index f7eee52..6c695b6 100644
--- a/Source/JavaScriptCore/ftl/FTLB3Output.h
+++ b/Source/JavaScriptCore/ftl/FTLB3Output.h
@@ -128,8 +128,8 @@
     LValue mul(LValue left, LValue right) { return m_block->appendNew<B3::Value>(m_proc, B3::Mul, origin(), left, right); }
     LValue div(LValue left, LValue right) { return m_block->appendNew<B3::Value>(m_proc, B3::Div, origin(), left, right); }
     LValue chillDiv(LValue left, LValue right) { return m_block->appendNew<B3::Value>(m_proc, B3::ChillDiv, origin(), left, right); }
-    LValue mod(LValue left, LValue right) { m_block->appendNew<B3::Value>(m_proc, B3::Mod, origin(), left, right); }
-    LValue chillMod(LValue left, LValue right) { m_block->appendNew<B3::Value>(m_proc, B3::ChillMod, origin(), left, right); }
+    LValue mod(LValue left, LValue right) { return m_block->appendNew<B3::Value>(m_proc, B3::Mod, origin(), left, right); }
+    LValue chillMod(LValue left, LValue right) { return m_block->appendNew<B3::Value>(m_proc, B3::ChillMod, origin(), left, right); }
     LValue neg(LValue value)
     {
         LValue zero = m_block->appendIntConstant(m_proc, origin(), value->type(), 0);
@@ -211,12 +211,61 @@
     LValue load32(TypedPointer pointer) { return load(pointer, B3::Int32); }
     LValue load64(TypedPointer pointer) { return load(pointer, B3::Int64); }
     LValue loadPtr(TypedPointer pointer) { return load(pointer, B3::pointerType()); }
-    LValue loadFloatToDouble(TypedPointer);
+    LValue loadFloat(TypedPointer pointer) { return load(pointer, B3::Float); }
     LValue loadDouble(TypedPointer pointer) { return load(pointer, B3::Double); }
-    void store32(LValue value, TypedPointer pointer) { store(value, pointer); }
-    void store64(LValue value, TypedPointer pointer) { store(value, pointer); }
-    void storePtr(LValue value, TypedPointer pointer) { store(value, pointer); }
-    void storeDouble(LValue value, TypedPointer pointer) { store(value, pointer); }
+    void store32As8(LValue value, TypedPointer pointer);
+    void store32As16(LValue value, TypedPointer pointer);
+    void store32(LValue value, TypedPointer pointer)
+    {
+        ASSERT(value->type() == B3::Int32);
+        store(value, pointer);
+    }
+    void store64(LValue value, TypedPointer pointer)
+    {
+        ASSERT(value->type() == B3::Int64);
+        store(value, pointer);
+    }
+    void storePtr(LValue value, TypedPointer pointer)
+    {
+        ASSERT(value->type() == B3::pointerType());
+        store(value, pointer);
+    }
+    void storeFloat(LValue value, TypedPointer pointer)
+    {
+        ASSERT(value->type() == B3::Float);
+        store(value, pointer);
+    }
+    void storeDouble(LValue value, TypedPointer pointer)
+    {
+        ASSERT(value->type() == B3::Double);
+        store(value, pointer);
+    }
+
+    enum LoadType {
+        Load8SignExt32,
+        Load8ZeroExt32,
+        Load16SignExt32,
+        Load16ZeroExt32,
+        Load32,
+        Load64,
+        LoadPtr,
+        LoadFloat,
+        LoadDouble
+    };
+
+    LValue load(TypedPointer, LoadType);
+    
+    enum StoreType {
+        Store32As8,
+        Store32As16,
+        Store32,
+        Store64,
+        StorePtr,
+        StoreFloat,
+        StoreDouble
+    };
+
+    void store(LValue, TypedPointer, StoreType);
 
     LValue addPtr(LValue value, ptrdiff_t immediate = 0)
     {
@@ -342,11 +391,16 @@
     LValue extractValue(LValue aggVal, unsigned index) { CRASH(); }
 
     template<typename VectorType>
-    LValue call(LType type, LValue function, const VectorType& vector) { return m_block->appendNew<B3::CCallValue>(m_proc, type, origin(), B3::Value::AdjacencyList(vector)); }
-    LValue call(LType type, LValue function) { return m_block->appendNew<B3::CCallValue>(m_proc, type, origin()); }
-    LValue call(LType type, LValue function, LValue arg1) { return m_block->appendNew<B3::CCallValue>(m_proc, type, origin(), arg1); }
+    LValue call(LType type, LValue function, const VectorType& vector)
+    {
+        B3::CCallValue* result = m_block->appendNew<B3::CCallValue>(m_proc, type, origin(), function);
+        result->children().appendVector(vector);
+        return result;
+    }
+    LValue call(LType type, LValue function) { return m_block->appendNew<B3::CCallValue>(m_proc, type, origin(), function); }
+    LValue call(LType type, LValue function, LValue arg1) { return m_block->appendNew<B3::CCallValue>(m_proc, type, origin(), function, arg1); }
     template<typename... Args>
-    LValue call(LType type, LValue function, LValue arg1, Args... args) { return m_block->appendNew<B3::CCallValue>(m_proc, type, origin(), arg1, args...); }
+    LValue call(LType type, LValue function, LValue arg1, Args... args) { return m_block->appendNew<B3::CCallValue>(m_proc, type, origin(), function, arg1, args...); }
 
     template<typename FunctionType>
     LValue operation(FunctionType function) { return constIntPtr(bitwise_cast<void*>(function)); }
diff --git a/Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp b/Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp
index ee35d2f..17539c1 100644
--- a/Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp
+++ b/Source/JavaScriptCore/ftl/FTLLowerDFGToLLVM.cpp
@@ -3020,7 +3020,7 @@
                 LValue result;
                 switch (type) {
                 case TypeFloat32:
-                    result = m_out.loadFloatToDouble(pointer);
+                    result = m_out.fpCast(m_out.loadFloat(pointer), m_out.doubleType);
                     break;
                 case TypeFloat64:
                     result = m_out.loadDouble(pointer);
@@ -3186,10 +3186,6 @@
         }
             
         default:
-#if FTL_USES_B3
-            UNUSED_PARAM(child5);
-            CRASH();
-#else
             TypedArrayType type = m_node->arrayMode().typedArrayType();
             
             if (isTypedView(type)) {
@@ -3201,7 +3197,7 @@
                             m_out.zeroExt(index, m_out.intPtr),
                             m_out.constIntPtr(logElementSize(type)))));
                 
-                LType refType;
+                Output::StoreType storeType;
                 LValue valueToStore;
                 
                 if (isInt(type)) {
@@ -3276,19 +3272,17 @@
                     default:
                         DFG_CRASH(m_graph, m_node, "Bad use kind");
                     }
-                    
+
+                    valueToStore = intValue;
                     switch (elementSize(type)) {
                     case 1:
-                        valueToStore = m_out.intCast(intValue, m_out.int8);
-                        refType = m_out.ref8;
+                        storeType = Output::Store32As8;
                         break;
                     case 2:
-                        valueToStore = m_out.intCast(intValue, m_out.int16);
-                        refType = m_out.ref16;
+                        storeType = Output::Store32As16;
                         break;
                     case 4:
-                        valueToStore = intValue;
-                        refType = m_out.ref32;
+                        storeType = Output::Store32;
                         break;
                     default:
                         DFG_CRASH(m_graph, m_node, "Bad element size");
@@ -3297,12 +3291,12 @@
                     LValue value = lowDouble(child3);
                     switch (type) {
                     case TypeFloat32:
-                        valueToStore = value;
-                        refType = m_out.refFloat;
+                        valueToStore = m_out.fpCast(value, m_out.floatType);
+                        storeType = Output::StoreFloat;
                         break;
                     case TypeFloat64:
                         valueToStore = value;
-                        refType = m_out.refDouble;
+                        storeType = Output::StoreDouble;
                         break;
                     default:
                         DFG_CRASH(m_graph, m_node, "Bad typed array type");
@@ -3310,7 +3304,7 @@
                 }
                 
                 if (m_node->arrayMode().isInBounds() || m_node->op() == PutByValAlias)
-                    m_out.store(valueToStore, pointer, refType);
+                    m_out.store(valueToStore, pointer, storeType);
                 else {
                     LBasicBlock isInBounds = FTL_NEW_BLOCK(m_out, ("PutByVal typed array in bounds case"));
                     LBasicBlock continuation = FTL_NEW_BLOCK(m_out, ("PutByVal typed array continuation"));
@@ -3320,7 +3314,7 @@
                         unsure(continuation), unsure(isInBounds));
                     
                     LBasicBlock lastNext = m_out.appendTo(isInBounds, continuation);
-                    m_out.store(valueToStore, pointer, refType);
+                    m_out.store(valueToStore, pointer, storeType);
                     m_out.jump(continuation);
                     
                     m_out.appendTo(continuation, lastNext);
@@ -3330,7 +3324,6 @@
             }
             
             DFG_CRASH(m_graph, m_node, "Bad array type");
-#endif // FTL_USES_B3
             break;
         }
     }
@@ -3371,10 +3364,6 @@
     
     void compileArrayPush()
     {
-#if FTL_USES_B3
-        if (verboseCompilationEnabled() || !verboseCompilationEnabled())
-            CRASH();
-#else
         LValue base = lowCell(m_node->child1());
         LValue storage = lowStorage(m_node->child3());
         
@@ -3383,7 +3372,7 @@
         case Array::Contiguous:
         case Array::Double: {
             LValue value;
-            LType refType;
+            Output::StoreType storeType;
             
             if (m_node->arrayMode().type() != Array::Double) {
                 value = lowJSValue(m_node->child2(), ManualOperandSpeculation);
@@ -3391,13 +3380,13 @@
                     FTL_TYPE_CHECK(
                         jsValueValue(value), m_node->child2(), SpecInt32, isNotInt32(value));
                 }
-                refType = m_out.ref64;
+                storeType = Output::Store64;
             } else {
                 value = lowDouble(m_node->child2());
                 FTL_TYPE_CHECK(
                     doubleValue(value), m_node->child2(), SpecDoubleReal,
                     m_out.doubleNotEqualOrUnordered(value, value));
-                refType = m_out.refDouble;
+                storeType = Output::StoreDouble;
             }
             
             IndexedAbstractHeap& heap = m_heaps.forArrayType(m_node->arrayMode().type());
@@ -3415,7 +3404,7 @@
             
             LBasicBlock lastNext = m_out.appendTo(fastPath, slowPath);
             m_out.store(
-                value, m_out.baseIndex(heap, storage, m_out.zeroExtPtr(prevLength)), refType);
+                value, m_out.baseIndex(heap, storage, m_out.zeroExtPtr(prevLength)), storeType);
             LValue newLength = m_out.add(prevLength, m_out.int32One);
             m_out.store32(newLength, storage, m_heaps.Butterfly_publicLength);
             
@@ -3441,7 +3430,6 @@
             DFG_CRASH(m_graph, m_node, "Bad array type");
             return;
         }
-#endif
     }
     
     void compileArrayPop()
diff --git a/Source/JavaScriptCore/ftl/FTLOutput.cpp b/Source/JavaScriptCore/ftl/FTLOutput.cpp
index aa025ae..5d67dcb 100644
--- a/Source/JavaScriptCore/ftl/FTLOutput.cpp
+++ b/Source/JavaScriptCore/ftl/FTLOutput.cpp
@@ -26,13 +26,15 @@
 #include "config.h"
 
 #include "DFGCommon.h"
+#include "FTLB3Output.h"
 #include "FTLOutput.h"
 
 #if ENABLE(FTL_JIT)
-#if !FTL_USES_B3
 
 namespace JSC { namespace FTL {
 
+#if !FTL_USES_B3
+
 Output::Output(State& state)
     : IntrinsicRepository(state.context)
     , m_function(0)
@@ -177,8 +179,6 @@
 
 void Output::store(LValue value, TypedPointer pointer, LType refType)
 {
-    if (refType == refFloat)
-        value = buildFPCast(m_builder, value, floatType);
     LValue result = set(value, intToPtr(pointer.value(), refType));
     pointer.heap().decorateInstruction(result, *m_heaps);
 }
@@ -238,8 +238,63 @@
     check(condition, taken, taken.weight().inverse());
 }
 
+#endif // !FTL_USES_B3
+
+LValue Output::load(TypedPointer pointer, LoadType type)
+{
+    switch (type) {
+    case Load8SignExt32:
+        return load8SignExt32(pointer);
+    case Load8ZeroExt32:
+        return load8ZeroExt32(pointer);
+    case Load16SignExt32:
+        return load16SignExt32(pointer);
+    case Load16ZeroExt32:
+        return load16ZeroExt32(pointer);
+    case Load32:
+        return load32(pointer);
+    case Load64:
+        return load64(pointer);
+    case LoadPtr:
+        return loadPtr(pointer);
+    case LoadFloat:
+        return loadFloat(pointer);
+    case LoadDouble:
+        return loadDouble(pointer);
+    }
+    RELEASE_ASSERT_NOT_REACHED();
+    return nullptr;
+}
+
+void Output::store(LValue value, TypedPointer pointer, StoreType type)
+{
+    switch (type) {
+    case Store32As8:
+        store32As8(value, pointer);
+        return;
+    case Store32As16:
+        store32As16(value, pointer);
+        return;
+    case Store32:
+        store32(value, pointer);
+        return;
+    case Store64:
+        store64(value, pointer);
+        return;
+    case StorePtr:
+        storePtr(value, pointer);
+        return;
+    case StoreFloat:
+        storeFloat(value, pointer);
+        return;
+    case StoreDouble:
+        storeDouble(value, pointer);
+        return;
+    }
+    RELEASE_ASSERT_NOT_REACHED();
+}
+
 } } // namespace JSC::FTL
 
-#endif // !FTL_USES_B3
 #endif // ENABLE(FTL_JIT)
 
diff --git a/Source/JavaScriptCore/ftl/FTLOutput.h b/Source/JavaScriptCore/ftl/FTLOutput.h
index 448189b..36ba1ca 100644
--- a/Source/JavaScriptCore/ftl/FTLOutput.h
+++ b/Source/JavaScriptCore/ftl/FTLOutput.h
@@ -260,14 +260,43 @@
     LValue load32(TypedPointer pointer) { return load(pointer, ref32); }
     LValue load64(TypedPointer pointer) { return load(pointer, ref64); }
     LValue loadPtr(TypedPointer pointer) { return load(pointer, refPtr); }
-    LValue loadFloatToDouble(TypedPointer pointer) { return buildFPCast(m_builder, load(pointer, refFloat), doubleType); }
+    LValue loadFloat(TypedPointer pointer) { return load(pointer, refFloat); }
     LValue loadDouble(TypedPointer pointer) { return load(pointer, refDouble); }
-    void store16(LValue value, TypedPointer pointer) { store(value, pointer, ref16); }
+
+    enum LoadType {
+        Load8SignExt32,
+        Load8ZeroExt32,
+        Load16SignExt32,
+        Load16ZeroExt32,
+        Load32,
+        Load64,
+        LoadPtr,
+        LoadFloat,
+        LoadDouble
+    };
+
+    LValue load(TypedPointer, LoadType);
+    
+    void store32As8(LValue value, TypedPointer pointer) { store(intCast(value, int8), pointer, ref8); }
+    void store32As16(LValue value, TypedPointer pointer) { store(intCast(value, int16), pointer, ref16); }
     void store32(LValue value, TypedPointer pointer) { store(value, pointer, ref32); }
     void store64(LValue value, TypedPointer pointer) { store(value, pointer, ref64); }
     void storePtr(LValue value, TypedPointer pointer) { store(value, pointer, refPtr); }
+    void storeFloat(LValue value, TypedPointer pointer) { store(value, pointer, refFloat); }
     void storeDouble(LValue value, TypedPointer pointer) { store(value, pointer, refDouble); }
 
+    enum StoreType {
+        Store32As8,
+        Store32As16,
+        Store32,
+        Store64,
+        StorePtr,
+        StoreFloat,
+        StoreDouble
+    };
+
+    void store(LValue, TypedPointer, StoreType);
+
     LValue addPtr(LValue value, ptrdiff_t immediate = 0)
     {
         if (!immediate)