using cxx::types
By Jordan DeLong. ©2012- Facebook. Do not redistribute.
using cxx::types;
Jordan DeLong, Software Engineer, Facebook
Overview
• C++ and its type system
• Weakly typed code
• Refactoring example
Preliminaries
Why does Facebook use C++?
• Performance
◦ At scale: operational costs > engineering costs
◦ C++ gives programmers low-level control
• Abstraction tools
◦ Lambdas and higher-order functions
◦ Type deduction (auto, template arguments)
◦ Powerful type system
C++: Powerful Type System
• template<class T>
• template<int I>
• template<template <class> class T>
• OO-style subtyping/polymorphism
Basics:
• New statically incompatible types can be created
• Function and operator overloading
C++: “Powerful” Type System?
• Error-prone standard conversions
◦ Nearly all primitive types convert to bool
◦ unsigned to signed
◦ narrowing conversions
◦ floating-integral conversions
• void*, unsigned char*, untyped memory
• typedef only makes type aliases
◦ Creating real new types is more verbose
Strong vs. Weak
Usual definition:
A type system is “strong” if it disallows conversions
between values of different types.
Many? Most? Unsafe? Implicit?
In the context of static type systems:
• Unclear as a language property
• Better: “strongly typed” is a property of code
• Weakly typed code can be written in a strongly typed language
What We Really Want
Goal: fewer bugs, guaranteed correctness.
• Move runtime errors to compile time
• One way to do this: strongly typed APIs
What this means:
• Types are a critical part of interface design
• Types should encode semantics
• Primitive types are just building blocks (especially in C++)
Weakly Typed APIs
Weak: High-arity Functions
public void DrawRectangle(
    Color colorOutline,
    int thicknessOutline,
    int x,
    int y,
    int width,
    int height,
    int xCornerRadius,
    int yCornerRadius,
    Color colorGradientStart,
    int xGradientStart,
    int yGradientStart,
    Color colorGradientEnd,
    int xGradientEnd,
    int yGradientEnd,
    UInt16 opacity
)
Weak: High-arity Functions
• Semantics are primarily encoded by position
• Worse when arguments have compatible types
• Temptation to use or add defaulted arguments
◦ Hinders refactoring
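One way to cut the arity of a signature like DrawRectangle is to group related arguments into small types. The sketch below is in C++ rather than the language of the original API, and every struct name in it is hypothetical; it only illustrates that once position and size are distinct types, swapping them no longer compiles.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical grouping types; none of these names come from the original API.
struct Color { std::uint32_t rgba; };
struct Point { int x; int y; };
struct Size { int width; int height; };
struct Outline { Color color; int thickness; };
struct GradientStop { Color color; Point at; };

struct RectangleStyle {
  Outline outline;
  Size cornerRadius;
  GradientStop gradientStart;
  GradientStop gradientEnd;
  std::uint16_t opacity;
};

// Three arguments instead of fifteen. The body is a stand-in that returns
// the rectangle's area so the sketch has something observable.
int drawRectangle(Point origin, Size size, const RectangleStyle& style) {
  (void)origin;
  (void)style;
  return size.width * size.height;
}

// A Size cannot be passed where a Point is expected, and vice versa.
static_assert(!std::is_convertible<Size, Point>::value, "Size is not a Point");
static_assert(!std::is_convertible<Point, Size>::value, "Point is not a Size");
```

The fields inside each struct are still positional under aggregate initialization, so this narrows the problem rather than eliminating it.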
Weak: “Types” in Identifiers
void sleep(int seconds);
sleep(3600 * 60 * 2 /* two hours */);
• Semantics are encoded in the identifier
• Use types that encode units, e.g. std::chrono::duration<>
using namespace std::chrono;
void sleep(seconds s);
sleep(duration_cast<seconds>(hours(2)));
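A small sketch of what the typed version buys, using only the standard <chrono> header. The function requestedSleepSeconds is a hypothetical stand-in for sleep that reports its argument instead of blocking.

```cpp
#include <cassert>
#include <chrono>
#include <type_traits>

// Hypothetical stand-in for a sleep API: reports the requested number of
// seconds instead of actually blocking.
long long requestedSleepSeconds(std::chrono::seconds s) {
  return s.count();
}

// A bare int does not convert implicitly to a duration, and neither does a
// finer-grained duration whose conversion would lose precision.
static_assert(!std::is_convertible<int, std::chrono::seconds>::value,
              "raw integers need an explicit unit");
static_assert(!std::is_convertible<std::chrono::milliseconds,
                                   std::chrono::seconds>::value,
              "lossy conversions need duration_cast");

void chronoExample() {
  using namespace std::chrono;
  // hours -> seconds is lossless, so it converts implicitly.
  assert(requestedSleepSeconds(hours(2)) == 7200);
  // The raw-int arithmetic from the slide is a different number entirely.
  assert(3600 * 60 * 2 == 432000);
  // milliseconds -> seconds truncates, so it has to be spelled out.
  assert(requestedSleepSeconds(duration_cast<seconds>(milliseconds(1500))) == 1);
}
```

Because hours converts to seconds without loss, the duration_cast in the slide's call is optional; it becomes mandatory only for conversions that can truncate.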
Weak: Boolean Arguments
typedef int Id;
enum MetaKind { ... };
void addMeta(int pos,
             MetaKind kind,
             MetaData* mdata,
             bool mIsVector,
             Id id);
Particularly bad in C++: every other argument type here implicitly
converts to bool.
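One possible fix, sketched under the assumption that the flag can be replaced by a scoped enum. MetaShape and the enumerators of MetaKind are hypothetical names introduced for illustration, and the function body is a stand-in.

```cpp
#include <cassert>
#include <type_traits>

typedef int Id;
enum MetaKind { KindA, KindB };  // placeholder enumerators, not from the slide
struct MetaData {};

// Hypothetical replacement for the bool parameter. A scoped enum has no
// implicit conversions from int, pointers, or unscoped enums.
enum class MetaShape { Scalar, Vector };

bool addMeta(int pos, MetaKind kind, MetaData* mdata, MetaShape shape, Id id) {
  (void)pos; (void)kind; (void)mdata; (void)id;
  return shape == MetaShape::Vector;  // stand-in body
}

// With a bool parameter, every other argument type converts silently.
static_assert(std::is_convertible<int, bool>::value, "int -> bool");
static_assert(std::is_convertible<MetaKind, bool>::value, "enum -> bool");
static_assert(std::is_convertible<MetaData*, bool>::value, "pointer -> bool");

// With the scoped enum, none of them do.
static_assert(!std::is_convertible<int, MetaShape>::value, "");
static_assert(!std::is_convertible<MetaKind, MetaShape>::value, "");
static_assert(!std::is_convertible<MetaData*, MetaShape>::value, "");
static_assert(!std::is_convertible<bool, MetaShape>::value, "");
```

Call sites also become self-describing: MetaShape::Vector says what a bare true does not.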
Refactoring: HHVM Assembler
An Assembler API
typedef int register_name_t;
const register_name_t rax = 0;
const register_name_t rbx = 1;
// ...
struct Asm {
  // ...
  void load_reg64_disp_reg32(int rbase, int disp, int rdest);
  void load_reg64_disp_reg64(int, int, int);
  void sub_imm32_reg32(intptr_t, int);
  void mov_imm64_reg(intptr_t, int);
  void load_reg64_index_scale_disp_reg64(
      int rbase, int rindex, int scale, int disp, int rdest);
  // ...
};
x64 Memory Operands
// *rax = rbx;
movq %rbx, (%rax) ; base + 0
// rax[0xc] = rbx;
movq %rbx, 0xc(%rax) ; base + disp
// rax[rcx*2+0xc] = 0x42;
movq $0x42, 0xc(%rax,%rcx,0x2) ; base + idx*2 + disp
General case: base + index * scale + displacement.
Using the API
void (*g_destructors[4])(void*) =
  {destructString, destructArray,
   destructObject, destructRef};
// Dispatch to appropriate destructor function
a.load_reg64_disp_reg32(rbx, TVOFF(m_type), rsi);
a.sub_imm32_reg32(KindOfString, rsi);
a.mov_imm64_reg(uintptr_t(&g_destructors), rax);
a.load_reg64_index_scale_disp_reg64(rax, rsi, 8, 0, rax);
a.call_reg(rax);
Using the API
movl 0xc(%rbx), %esi
subl $0xf,%esi
movq $0x6a28240,%rax
movq (%rax,%rsi,8),%rax
callq *%rax
Misusing the API
void load_reg64_index_scale_disp_reg64(
int rbase, int rindex, int scale, int disp,
int rdest);
Possible Errors #1
a.load_reg64_disp_reg64(rVmFp, AROFF(m_this), rax);
a.store_reg64_disp_reg64(0, AROFF(m_this), rVmFp); // <-- 0 is read as register rax, not an immediate
a.shr_imm32_reg64(1, rax);
a.jcc(CC_NBE, decRefThisStub);
Possible Errors #1
a.load_reg64_disp_reg64(rVmFp, AROFF(m_this), rax);
a.store_imm64_disp_reg64(0, AROFF(m_this), rVmFp); // <-- reg -> imm
a.shr_imm32_reg64(1, rax);
a.jcc(CC_NBE, decRefThisStub);
Possible Errors #2
a.mov_imm32_reg32(-1, rax);
a.store_reg64_disp_reg64(rax, 0, rbx); // not sign-extended: rax holds 0x00000000ffffffff
Possible Errors #2
a.mov_imm32_reg64(-1, rax); // 32 -> 64: the immediate is sign-extended
a.store_reg64_disp_reg64(rax, 0, rbx);
Improving on this
• Argument semantics we can encode in types:
◦ Registers vs. immediates
◦ How a register is used (value vs. part of memory operand)
◦ Operand sizes (eax vs. rax)
• Reduce function arity
• Function naming: closer to x64 opcode mnemonics
Refactored API
// Dispatch to appropriate destructor function
a.movl(rbx[TVOFF(m_type)], esi);
a.subl(KindOfString, esi);
a.movq(&g_destructors, rax);
a.movq(rax[rsi*8], rax);
a.call(rax);
New Register Types
/// Reg64, Reg32, RegXMM, RegRIP ...
constexpr Reg64 rax(0);
constexpr Reg64 rcx(1);
constexpr Reg32 eax(0);
constexpr Reg32 ecx(1);
constexpr Reg8 al(0);
constexpr RegXMM xmm0(0);
constexpr RegRIP rip;
// etc
Register Type Details
struct Reg64 {
  explicit constexpr Reg64(int);
  explicit constexpr operator int() const;
  constexpr bool operator==(Reg64) const;
  constexpr bool operator!=(Reg64) const;
  // ...
};
• We’re using a struct instead of enum class because we want to
define an operator[] later.
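A minimal sketch of how such register types might be filled in. The member bodies and the private m_n field are assumptions; the slide shows declarations only. The static_asserts check that nothing converts implicitly between int, Reg64, and Reg32.

```cpp
#include <cassert>
#include <type_traits>

// Assumed implementation: the slide only declares these members.
struct Reg64 {
  explicit constexpr Reg64(int n) : m_n(n) {}
  explicit constexpr operator int() const { return m_n; }
  constexpr bool operator==(Reg64 o) const { return m_n == o.m_n; }
  constexpr bool operator!=(Reg64 o) const { return m_n != o.m_n; }
 private:
  int m_n;
};

struct Reg32 {
  explicit constexpr Reg32(int n) : m_n(n) {}
  explicit constexpr operator int() const { return m_n; }
 private:
  int m_n;
};

constexpr Reg64 rax(0);
constexpr Reg64 rcx(1);
constexpr Reg32 eax(0);

// Explicit constructor and conversion operator: no silent int <-> register.
static_assert(!std::is_convertible<int, Reg64>::value, "");
static_assert(!std::is_convertible<Reg64, int>::value, "");
// Same register number, different operand size: still unrelated types.
static_assert(!std::is_convertible<Reg32, Reg64>::value, "");
static_assert(!std::is_convertible<Reg64, Reg32>::value, "");
// The number is still reachable when asked for explicitly.
static_assert(static_cast<int>(rcx) == 1, "");
```

Both rax and eax carry the number 0, yet a function taking Reg64 rejects eax at compile time.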
Register to Register mov
void movb(Reg8, Reg8); // 8-bit operands
void movl(Reg32, Reg32); // 32-bit operands
void movq(Reg64, Reg64); // 64-bit operands
a.movl(rax, rbx); // compile-time error
a.movl(eax, ebx); // ok
Simple Memory Operands
// reg + offset
struct DispReg {
  Reg64 rbase;
  intptr_t disp;
};
DispReg operator+(Reg64 rbase, intptr_t disp);
Indexed Memory Operands
// reg * scale
struct ScaledIndex {
  Reg64 rindex;
  int scale;
};
ScaledIndex operator*(Reg64 rindex, int scale);
// reg + reg*scale + disp
struct IndexedDispReg {
  Reg64 rbase;
  ScaledIndex index;
  intptr_t disp;
};
IndexedDispReg operator+(Reg64 rbase, ScaledIndex);
IndexedDispReg operator+(IndexedDispReg, intptr_t);
Dereferenced Memory Operands
// *(reg + offset)
struct MemoryRef {
  DispReg dr;
};
MemoryRef operator*(DispReg);
// *(reg + offset + reg*scale + disp)
struct IndexedMemoryRef {
  IndexedDispReg dr;
};
IndexedMemoryRef operator*(IndexedDispReg);
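The operand types above compose into a small expression language. The following sketch gives the declared operators assumed bodies and simplifies Reg64 to a plain aggregate (the field name n is hypothetical), so that an expression like *(rax + rcx * 8 + 0xc) can be built and inspected.

```cpp
#include <cassert>
#include <cstdint>

struct Reg64 { int n; };  // simplified aggregate for this sketch only

struct DispReg { Reg64 rbase; std::intptr_t disp; };
struct ScaledIndex { Reg64 rindex; int scale; };
struct IndexedDispReg { Reg64 rbase; ScaledIndex index; std::intptr_t disp; };
struct MemoryRef { DispReg dr; };
struct IndexedMemoryRef { IndexedDispReg dr; };

// Assumed bodies for the operators declared on the slides.
inline DispReg operator+(Reg64 rbase, std::intptr_t disp) {
  return DispReg{rbase, disp};
}
inline ScaledIndex operator*(Reg64 rindex, int scale) {
  return ScaledIndex{rindex, scale};
}
inline IndexedDispReg operator+(Reg64 rbase, ScaledIndex index) {
  return IndexedDispReg{rbase, index, 0};
}
inline IndexedDispReg operator+(IndexedDispReg idr, std::intptr_t disp) {
  idr.disp += disp;
  return idr;
}
inline MemoryRef operator*(DispReg dr) { return MemoryRef{dr}; }
inline IndexedMemoryRef operator*(IndexedDispReg idr) {
  return IndexedMemoryRef{idr};
}

const Reg64 rax{0};
const Reg64 rcx{1};

// base + index*scale + disp, dereferenced: the general x64 operand form.
inline IndexedMemoryRef generalOperand() {
  return *(rax + rcx * 8 + 0xc);
}
```

Each intermediate expression has its own type, so an address that has not been dereferenced (a DispReg) cannot be passed where a memory operand (a MemoryRef) is required.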
Loads
void movq(MemoryRef, Reg64);
void movl(MemoryRef, Reg32);
void movq(IndexedMemoryRef, Reg64);
void movl(IndexedMemoryRef, Reg32);
a.movl(*(rbx + 0xc), rax); // compile-time error
a.movl(*(rbx + 0xc), eax); // ok
Stores
void movq(Reg64, MemoryRef);
void movl(Reg32, MemoryRef);
void movq(Reg64, IndexedMemoryRef);
void movl(Reg32, IndexedMemoryRef);
a.movq(rbx, rax[0xc]); // Reg64 gets an operator[]
a.movq(ebx, rax[0xc]); // compile-time error
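The compile-time errors noted above can themselves be checked at compile time. This sketch uses a small detection trait (AcceptsMovl, a hypothetical helper) to ask whether a given movl call would compile. The register and operand types are simplified aggregates, and the Asm here only records which overload was chosen.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

struct Reg64 { int n; };
struct Reg32 { int n; };
struct MemoryRef { Reg64 rbase; std::intptr_t disp; };

// Stand-in assembler: logs the selected overload instead of emitting code.
struct Asm {
  std::vector<std::string> log;
  void movq(MemoryRef, Reg64) { log.push_back("load64"); }
  void movl(MemoryRef, Reg32) { log.push_back("load32"); }
  void movq(Reg64, MemoryRef) { log.push_back("store64"); }
  void movl(Reg32, MemoryRef) { log.push_back("store32"); }
};

// Hypothetical helper: value is true if a.movl(Src, Dst) would compile.
template <class Src, class Dst>
class AcceptsMovl {
  template <class S, class D>
  static auto probe(int) -> decltype(
      std::declval<Asm&>().movl(std::declval<S>(), std::declval<D>()),
      std::true_type());
  template <class, class>
  static std::false_type probe(...);

 public:
  static const bool value = decltype(probe<Src, Dst>(0))::value;
};

static_assert(AcceptsMovl<MemoryRef, Reg32>::value, "32-bit load");
static_assert(AcceptsMovl<Reg32, MemoryRef>::value, "32-bit store");
static_assert(!AcceptsMovl<MemoryRef, Reg64>::value, "width mismatch on load");
static_assert(!AcceptsMovl<Reg64, MemoryRef>::value, "width mismatch on store");
static_assert(!AcceptsMovl<MemoryRef, MemoryRef>::value, "no mem-to-mem mov");
```

The overload set encodes both the direction of the move and the operand width, which the old API encoded only in the function name.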
Other opcodes
// Load effective address:
void lea(IndexedDispReg, Reg64);
void lea(DispReg, Reg64);
// Push can take memory or registers:
void pushq(IndexedMemoryRef);
void pushq(MemoryRef);
void pushq(Reg64);
Discussion
• Registers ≢ Memory Operands ≢ Immediates ≢ Registers
• Memory Operands: encoded as an embedded expression language
• Immediates:
void movq(Immed, Reg64);
void movl(Immed, Reg32);
void movb(Immed, Reg8);
◦ Thin, runtime-checked wrapper around intptr_t
◦ Still potentially vulnerable to runtime integer-related issues
• Looks more like the assembly we’re trying to generate
Making immediates safer
void movimm(Immed immed, Reg64 reg);
struct Immed {
  template<class T>
  /* implicit */ Immed(
    T i, typename std::enable_if<...>::type* = 0
  ) : m_int(/* ... */) {}
  // various accessors q(), l(), w()
  // fitsSigned(), fitsUnsigned()
};
Making immediates safer
void movimm(Immed imm, Reg64 dest) {
  if (imm.q() == 0) return xorl(r32(dest), r32(dest));
  if (imm.q() > 0 && imm.fitsUnsigned(sz::dword)) {
    return movl(imm, r32(dest)); // zeros top bits
  }
  movq(imm, dest);
}
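A self-contained sketch of the same idea. The enable_if condition on the slide is elided, so the one used here (integral types other than bool) is an assumption, as are fitsUnsigned32, the Encoding enum, and chooseMovEncoding, which mirrors the branch structure of movimm but returns the chosen encoding instead of emitting code.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

struct Immed {
  // Assumed constraint: accept integral types except bool.
  template <class T>
  /* implicit */ Immed(
      T i,
      typename std::enable_if<std::is_integral<T>::value &&
                              !std::is_same<T, bool>::value>::type* = nullptr)
      : m_int(static_cast<std::int64_t>(i)) {}

  std::int64_t q() const { return m_int; }

  // Hypothetical accessor: does the value fit in an unsigned 32-bit field?
  bool fitsUnsigned32() const {
    return m_int >= 0 &&
           m_int <= std::numeric_limits<std::uint32_t>::max();
  }

 private:
  std::int64_t m_int;
};

// Floating-point values and bools are rejected at compile time.
static_assert(!std::is_convertible<double, Immed>::value, "");
static_assert(!std::is_convertible<bool, Immed>::value, "");
static_assert(std::is_convertible<int, Immed>::value, "");

enum class Encoding { Xor32, Mov32, Mov64 };

// Same decision tree as movimm on the slide, minus the code emission.
Encoding chooseMovEncoding(Immed imm) {
  if (imm.q() == 0) return Encoding::Xor32;
  if (imm.q() > 0 && imm.fitsUnsigned32()) return Encoding::Mov32;
  return Encoding::Mov64;
}
```

Values of unsigned 64-bit type above INT64_MAX would wrap in this sketch, which is the kind of runtime integer issue the Discussion slide says the wrapper remains vulnerable to.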