Classes in C++: Operator Overloading

Operator Overloading

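Function overloading vs. operator overloading

Function overloading (static polymorphism): several functions share one name but differ in their parameter lists, and the compiler picks the version to call at compile time from the argument list. This "one name, many uses" is static polymorphism, as opposed to the dynamic polymorphism of inheritance plus virtual functions.
Operator overloading:

It lets the programmer:
- take the operators that already exist for built-in types (+ - * / >> and so on)
- and redefine what they mean for user-defined types (syntax, precedence, and associativity stay unchanged)
Benefits:
- Readability: c = a + b is more direct than c = a.add(b)
- Extensibility: user-defined types can be "used just like built-in types"

Example (comparing the two approaches):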

class Complex {
    double real, imag;
public:
    Complex(double r=0, double i=0) : real(r), imag(i) {}
    Complex add(const Complex& x); // ordinary member function
};
Complex a(1,2), b(3,4), c;
c = a.add(b); // ordinary call

// Rewritten with operator overloading
class Complex {
    double real, imag;
public:
    Complex(double r=0, double i=0) : real(r), imag(i) {}

    // Option 1: member function
    Complex operator+(const Complex& x) const {
        Complex tmp;
        tmp.real = real + x.real;
        tmp.imag = imag + x.imag;
        return tmp;
    }

    // Option 2: non-member friend function
    // (keep only one of the two options: with both defined, a + b is ambiguous)
    friend Complex operator+(const Complex& c1, const Complex& c2);
};

Complex operator+(const Complex& c1, const Complex& c2) {
    Complex tmp;
    tmp.real = c1.real + c2.real;
    tmp.imag = c1.imag + c2.imag;
    return tmp;
}

Complex a(1,2), b(3,4), c;
c = a + b; // equivalent to a.operator+(b) or operator+(a, b)

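Two points to note:

At least one parameter of operator+ must be of a user-defined type (otherwise it would clash with the built-in operation; new/delete are the exception).
The operator form is easier to read, and precedence and associativity come out right automatically because the language's existing rules are reused.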

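Basic principles and commonly overloaded operators

Which operators can be overloaded: ., .*, ::, ?:, and sizeof cannot; most others can, including the arithmetic, relational, and bitwise operators, [], ->, ++, and --.
Ways to overload:
- as a member function
- as a global function that takes at least one parameter of class type (friends included)
The original syntax rules must be respected:
- the arity (unary or binary) cannot change
- the operator's precedence and associativity cannot change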
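
To make the "precedence cannot change" rule concrete, here is a minimal sketch. The Vec2 type and the combine function are hypothetical names introduced only for this illustration; they are not part of the notes above.

```cpp
// Hypothetical 2D vector type used only for this illustration.
struct Vec2 {
    int x, y;
};

// Component-wise addition.
Vec2 operator+(const Vec2& a, const Vec2& b) { return {a.x + b.x, a.y + b.y}; }

// Component-wise multiplication.
Vec2 operator*(const Vec2& a, const Vec2& b) { return {a.x * b.x, a.y * b.y}; }

// Parsed as a + (b * c), exactly as it would be for ints:
// overloading changes what * and + do, not how the expression is grouped.
Vec2 combine(const Vec2& a, const Vec2& b, const Vec2& c) { return a + b * c; }
```

With a = {1, 1}, b = {2, 2}, and c = {3, 3}, combine yields {7, 7}, while the explicitly parenthesized (a + b) * c yields {9, 9}.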

Overloading binary operators

Member function form:
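class T {
public:
    ret_type operator#(const T& rhs); // binary: lhs # rhs (# stands for any binary operator)
};

T a, b;
a # b; // the compiler translates this to a.operator#(b);
For a binary operator (+, -, *, == and so on) written as a member function, the left operand is the implicit this and the right operand is the explicit parameter.
Global function form: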
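// Friend declaration inside the class
friend ret_type operator#(arg1, arg2);

// Implementation at namespace/global scope
ret_type operator#(arg1, arg2);
Restriction: =, (), [], and -> cannot be overloaded as global functions; they must be members. These operators act directly on the object itself, so the language requires the class to supply them.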

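Why is a global version of operator+ still needed?

class CL {
    int count;
public:
    friend CL operator+(int i, const CL& a); // supports 10 + obj
    friend CL operator+(const CL& a, int i); // supports obj + 10
};

obj + 10 // a member function would also work: obj.operator+(10)
10 + obj // the left operand is not a class object, so a member function cannot handle it

Notes:

When operator+ is overloaded as a member function, the left operand must be an object of that class.
To support 10 + obj, the operator has to be overloaded as a non-member function (usually a friend).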
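
The two cases can be sketched in a compilable form as follows. The constructor and the value() accessor are assumptions added so the result can be inspected, and the obj + 10 case is written as a member here purely to contrast it with the friend.

```cpp
class CL {
    int count;
public:
    // Assumed constructor and accessor, added for the illustration.
    explicit CL(int c = 0) : count(c) {}
    int value() const { return count; }

    // Member form: handles obj + 10 (the left operand is *this).
    CL operator+(int i) const { return CL(count + i); }

    // Non-member friend: handles 10 + obj (the left operand is an int).
    friend CL operator+(int i, const CL& a) { return CL(i + a.count); }
};
```

Given CL obj(5), both (obj + 10).value() and (10 + obj).value() evaluate to 15. Removing the friend makes 10 + obj fail to compile, because no member of int can be called.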

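Do not overload && and ||:

char* p;
if ((p != 0) && (strlen(p) > 10)) ...

// If you overload &&
if (expression1 && expression2) ...
// it is actually treated as
if (expression1.operator&&(expression2)) ...
// or
if (operator&&(expression1, expression2)) ...

Notes:

The built-in && and || short-circuit: when the left side already decides the result (false for &&, true for ||), the right side is not evaluated at all.
Once you define your own operator&& or operator||, the expression becomes an ordinary function call whose operands are both evaluated before the call. The short-circuit semantics are lost, and code like the null check above no longer protects the right-hand side.
The reason ?: cannot be overloaded is similar.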
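
The loss of short-circuiting can be observed directly. The sketch below assumes a hypothetical Flag wrapper and a global evaluations counter, both invented for this demonstration.

```cpp
// Hypothetical wrapper type used only for this demonstration.
struct Flag {
    bool value;
};

// Counts how many times check() has run.
int evaluations = 0;

Flag check(bool v) {
    ++evaluations;
    return Flag{v};
}

// Overloaded &&: an ordinary function, so both arguments are
// evaluated before its body runs.
Flag operator&&(const Flag& a, const Flag& b) { return Flag{a.value && b.value}; }

// Built-in && on bool: the right operand is skipped when the left is false.
bool builtin_and() {
    evaluations = 0;
    return check(false).value && check(true).value;
}

// Overloaded && on Flag: both operands are always evaluated.
bool overloaded_and() {
    evaluations = 0;
    return (check(false) && check(true)).value;
}
```

After builtin_and() the counter is 1; after overloaded_and() it is 2, even though both return false.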

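Designing overloaded operators

Do not return a reference for the sake of efficiency when doing so breaks semantics or safety.

class Rational {
public:
    Rational(int, int);
    const Rational& operator*(const Rational& r) const;
private:
    int n, d;
};

// Attempt 1: reference to a temporary
return Rational(n * r.n, d * r.d); // dangling reference, undefined behavior

// Attempt 2: reference to a heap object
Rational *result = new Rational(n * r.n, d * r.d);
return *result; // the reference points to a heap object that nobody deletes: a leak

// Attempt 3: reference to a static object
static Rational result(0, 1);
result.n = n * r.n; result.d = d * r.d; return result; // every multiplication shares one static object, so the result of a*b is overwritten by the next call; not thread-safe either

// Correct form
Rational operator*(const Rational&) const; // return by value and let RVO optimize it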
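
For reference, a complete by-value version might look like the sketch below. The numerator() and denominator() accessors are assumptions added so the result can be checked; no reduction to lowest terms is attempted.

```cpp
class Rational {
public:
    Rational(int num, int den) : n(num), d(den) {}

    // Assumed accessors, added for the illustration.
    int numerator() const { return n; }
    int denominator() const { return d; }

    // Returns a new object by value: no dangling reference, no leak,
    // no shared static state.
    Rational operator*(const Rational& r) const {
        return Rational(n * r.n, d * r.d);
    }

private:
    int n, d;
};
```

Because each call produces its own result object, chained expressions such as a * b * c behave as expected.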

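Overloading unary operators

Member function form:

class T {
public:
    ret_type operator#(); // implicit this, used as a unary operator: #obj (or obj # for postfix forms)
};

Global function form:

ret_type operator#(T arg); // the object is passed as an explicit parameter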

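Overloading a++ vs. ++a

class Counter {
    int value;
public:
    Counter() { value = 0; }

    // Prefix ++a
    Counter& operator++() {
        value++;
        return *this; // returns a reference: ++a is an lvalue
    }

    // Postfix a++ (distinguished by a dummy int parameter)
    Counter operator++(int) {
        Counter temp = *this; // save the old value first
        value++;
        return temp; // returns the old value: a++ is an rvalue
    }
};

Key points:

Prefix: Counter& operator++() returns a reference; it is efficient and the result can be used as an lvalue.
Postfix: Counter operator++(int); the int parameter is only a placeholder that distinguishes the two forms syntactically, and the return value is normally a by-value copy of the old value.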
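
A common variation, sketched here as one possible design rather than the only one, implements the postfix form in terms of the prefix form so the increment logic lives in a single place. The get() accessor is an assumption added for inspection.

```cpp
class Counter {
    int value;
public:
    Counter() : value(0) {}

    // Assumed accessor, added for the illustration.
    int get() const { return value; }

    // Prefix: increment, then return the object itself.
    Counter& operator++() {
        ++value;
        return *this;
    }

    // Postfix: copy the old state, reuse the prefix form, return the copy.
    Counter operator++(int) {
        Counter old = *this;
        ++*this;
        return old;
    }
};
```

Starting from a fresh Counter c, the expression c++ yields a copy holding 0 while c itself becomes 1; ++c then returns a reference to c, which now holds 2.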

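Special operators

=

If a class does not define its own operator=, the compiler generates a default assignment operator that assigns member by member; members of class type are assigned recursively through their own operator=.
Assignment operators are not inherited: the members of a derived class differ from those of its base, and an operator= written in the base class is hidden by the derived class's own. A derived class either defines one explicitly or accepts the compiler-generated one.
Return value: the usual signature is T& operator=(const T& rhs);, which supports chained assignment. a = b = c; evaluates b = c first and then assigns the resulting reference to b to a.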
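
The following sketch illustrates both points with hypothetical Base and Derived classes. The assignments counter is an assumption added only to make the calls observable.

```cpp
class Base {
public:
    int assignments = 0;  // counts calls to Base::operator= on this object
    int x = 0;

    // Returns *this by reference so that assignments can be chained.
    Base& operator=(const Base& rhs) {
        x = rhs.x;
        ++assignments;
        return *this;
    }
};

// Derived declares no operator=. The compiler generates one that assigns
// the Base part through Base::operator= and then copies y.
class Derived : public Base {
public:
    int y = 0;
};
```

With three Derived objects where c.x is 1 and c.y is 2, the statement a = b = c; leaves a.x equal to 1 and a.y equal to 2, and the counter in both a and b reads 1.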

Assignment with pointer members:

#include <cstring> // strlen, strcpy

class A {
    int x, y;
    char* p;
public:
    A(int i, int j, const char* s) : x(i), y(j) {
        p = new char[strlen(s) + 1];
        strcpy(p, s);
    }
    virtual ~A() { delete[] p; }

    // Wrong: breaks on self-assignment
    A& operator=(A& a) {
        x = a.x; y = a.y;
        delete []p; // do not delete first: on self-assignment a.p has just been freed too
        p = new char[strlen(a.p) + 1]; // if new throws, p is left dangling
        strcpy(p, a.p);
        return *this;
    }

    // Correct
    A& operator=(const A& a) {
        char* pOrig = p; // remember the old pointer
        p = new char[strlen(a.p) + 1]; // allocate the new buffer first
        strcpy(p, a.p); // copy the contents
        delete[] pOrig; // then release the old buffer
        x = a.x; y = a.y;
        return *this;
    }
};

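Avoiding self-assignment:
An assignment operator has to account for self-assignment:

Widget& Widget::operator=(const Widget& rhs) {
    if (this == &rhs) // identity test: both sides are the same object
        return *this; // nothing to do, return immediately

    delete pb;
    pb = new Bitmap(*rhs.pb);
    return *this;
}

When this == &rhs is true, the statement is a self-assignment such as s = s;, and returning immediately is both the safest and the cheapest option. Without the check, delete pb; destroys the object's own data, and the copy from rhs.pb then reads the Bitmap that was just deleted, which is undefined behavior.

Alternatively, the following form handles self-assignment without any check:

Widget& Widget::operator=(const Widget& rhs) {
    Bitmap* pOrig = pb;
    pb = new Bitmap(*rhs.pb); // allocate the copy first, self-assignment or not
    delete pOrig;
    return *this;
}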
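
A third option, not covered in the notes above, is the copy-and-swap idiom. The sketch below assumes a hypothetical Bitmap stand-in and a pixels() accessor; taking the parameter by value makes the copy up front, so self-assignment and a throwing new both leave the target untouched.

```cpp
#include <utility>  // std::swap

// Hypothetical stand-in for the Bitmap type.
struct Bitmap {
    int pixels;
};

class Widget {
    Bitmap* pb;
public:
    explicit Widget(int pixels) : pb(new Bitmap{pixels}) {}
    Widget(const Widget& rhs) : pb(new Bitmap(*rhs.pb)) {}
    ~Widget() { delete pb; }

    // Assumed accessor, added for the illustration.
    int pixels() const { return pb->pixels; }

    void swap(Widget& other) noexcept { std::swap(pb, other.pb); }

    // Copy-and-swap: rhs is already a copy of the source, so swapping
    // hands the old Bitmap to rhs, whose destructor then releases it.
    Widget& operator=(Widget rhs) {
        swap(rhs);
        return *this;
    }
};
```

The trade-off is one extra copy even for self-assignment, in exchange for a very short operator that reuses the copy constructor and destructor.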

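[]

Read/write vs. read-only:

class string {
    char* p;
public:
    string(const char* p1) {
        p = new char[strlen(p1)+1];
        strcpy(p, p1);
    }

    char& operator[](int i) { // for non-const objects
        return p[i]; // returns char&, modifiable
    }

    const char operator[](int i) const { // for const objects
        return p[i]; // returns by value, read-only
    }

    virtual ~string() { delete[] p; }
};

string s("aacd");
s[2] = 'b'; // OK, writable
const string cs("const");
cout << cs[0]; // reading is OK
cs[0] = 'D'; // compile error: a const object can only call the const version, which is read-only

operator[] is usually provided as a pair of overloads:

Non-const version: returns a non-const reference and supports writes
Const version: marked with a trailing const, called on const objects, and allows reads only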
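
A compact way to see overload selection at work is the sketch below. Text and first_char are hypothetical names, and std::string is used for storage (an assumption made here so that copying and destruction need no hand-written code).

```cpp
#include <cstddef>
#include <string>

// Hypothetical wrapper; std::string handles the memory.
class Text {
    std::string data;
public:
    explicit Text(const char* s) : data(s) {}

    // Chosen for non-const objects: the returned reference allows writes.
    char& operator[](std::size_t i) { return data[i]; }

    // Chosen for const objects: the returned reference is read-only.
    const char& operator[](std::size_t i) const { return data[i]; }
};

// The parameter is a const reference, so only the const overload applies.
char first_char(const Text& t) { return t[0]; }
```

Writing s[2] = 'b' on a non-const Text compiles and modifies the element, while the same statement on a const Text is rejected by the compiler.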

Multidimensional arrays:
The goal is to support data[i][j] through operator[].

Underlying implementation:

Array2D actually holds just one int *p, which points to a one-dimensional array of n1 * n2 ints
The formula i * n2 + j turns a two-dimensional index into a one-dimensional one

class Array2D {
public:
    class Array1D {
    public:
        Array1D(int* p) : p(p) {}
        int& operator[](int index) { return p[index]; }
        const int operator[](int index) const { return p[index]; }
    private:
        int* p;
    };

    Array2D(int n1, int n2) {
        p = new int[n1*n2];
        num1 = n1; num2 = n2;
    }
    ~Array2D() { delete[] p; }

    Array1D operator[](int index) { // data[i]
        return Array1D(p + index * num2); // points to the start of row i
    }
    const Array1D operator[](int index) const {
        return Array1D(p + index * num2);
    }

private:
    int* p;
    int  num1, num2;
};

The call chain is:

data[i] calls Array2D::operator[], which returns an Array1D object (row i)
data[i][j] is data[i].operator[](j), which calls Array1D::operator[] and returns a reference to the element data[i][j]
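
The same proxy technique can be sketched with std::vector as storage, which is an assumption made here to avoid manual new/delete. Grid, Row, and at_flat are hypothetical names; at_flat exposes the flat index so the i * cols + j mapping can be checked.

```cpp
#include <cstddef>
#include <vector>

class Grid {
    std::size_t cols;
    std::vector<int> cells;  // rows * cols elements in row-major order
public:
    // Proxy for one row: holds a pointer to the row's first element.
    class Row {
        int* p;
    public:
        explicit Row(int* start) : p(start) {}
        int& operator[](std::size_t j) { return p[j]; }
    };

    Grid(std::size_t rows, std::size_t columns)
        : cols(columns), cells(rows * columns, 0) {}

    // g[i] returns the proxy for row i; g[i][j] then indexes into that row.
    Row operator[](std::size_t i) { return Row(cells.data() + i * cols); }

    // Read access by flat index, added for the illustration.
    int at_flat(std::size_t k) const { return cells[k]; }
};
```

For a 2 x 3 grid, writing g[1][2] = 42 stores the value at flat index 1 * 3 + 2 = 5.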

()

Function objects: once a class overloads operator(), its objects can be called like functions:

class Array2D {
    int n1, n2;
    int* p;
public:
    Array2D(int l, int c) : n1(l), n2(c) {
        p = new int[n1 * n2];
    }
    virtual ~Array2D() { delete[] p; }

    int& operator()(int i, int j) {
        return p[i * n2 + j];
    }
};

Array2D a(2, 3);
a(1, 2) = 0;

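As a comparator, using sort as the example:

#include <algorithm> // std::sort
#include <vector>

bool cmpInt(int a, int b) { return a < b; } // ordinary function

class CmpInt {
public:
    bool operator()(int a, int b) const {    // function object
        return a < b;
    }
};

int main() {
    std::vector<int> items { 4, 3, 1, 2 };
    std::sort(items.begin(), items.end(), cmpInt); // function pointer
    std::sort(items.begin(), items.end(), CmpInt()); // function object
    std::sort(items.begin(), items.end(), [](int a, int b) { return a < b; }); // lambda
}

The third parameter of std::sort is Compare comp; anything works as long as comp(x, y) compiles, returns something convertible to bool, and defines a strict weak ordering.
Things that can be called this way are known as "callable objects":
- ordinary functions (passed as function pointers);
- objects that overload operator() (functors / function objects);
- lambdas: the compiler generates an anonymous class whose call also goes through operator().

This is why operator() is called the "function call operator": overloading it lets your objects be "called like functions".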

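Function objects

Function pointers:

Can only point to an ordinary function
Cannot carry state of their own (a coefficient or a threshold, for example)
Give the compiler limited room for inlining, since the call goes through a pointer

Function objects (classes that overload operator()):

Are objects, so they can have member variables (they can carry state)
Have their own type, so the call target is known at compile time and can be inlined, which often performs better

Example: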

class Multiply {
    int factor;
public:
    Multiply(int f) : factor(f) {}
    int operator()(int x) const { return x * factor; }
};

Multiply times2(2);
Multiply times5(5);

cout << times2(10); // 20
cout << times5(10); // 50
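
State becomes especially useful when a function object is handed to a standard algorithm. The sketch below assumes a hypothetical GreaterThan predicate and a count_above helper, neither of which appears in the notes above.

```cpp
#include <algorithm>  // std::count_if
#include <cstddef>    // std::ptrdiff_t
#include <vector>

// Hypothetical predicate that remembers its threshold as state.
class GreaterThan {
    int threshold;
public:
    explicit GreaterThan(int t) : threshold(t) {}
    bool operator()(int x) const { return x > threshold; }
};

// Counts the elements of v that are strictly greater than threshold.
std::ptrdiff_t count_above(const std::vector<int>& v, int threshold) {
    return std::count_if(v.begin(), v.end(), GreaterThan(threshold));
}
```

A plain function pointer could not carry the threshold without resorting to a global variable; here each GreaterThan object holds its own.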

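Lambda functions

A lambda is just syntactic sugar for creating a function object: the compiler generates an anonymous function object class automatically.

Example:

auto add5 = [base = 5](int x) { return x + base; };
cout << add5(10);  // 15

// The compiler generates an anonymous class, roughly like this
class __lambda_add5 {
    int base;
public:
    __lambda_add5(int _base) : base(_base) {}
    int operator()(int x) const { return x + base; }
};

Comparing the λ-calculus from mathematics with C++ lambdas:
[Figure: comparison of λ-calculus notation and C++ lambda syntax]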

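What the different capture list [] forms mean:

Capture list	Meaning
`[]`	Captures no outside variables
`[&]`	Captures by reference every outside variable that the lambda body uses
`[=]`	Captures by value (copy) every outside variable that the lambda body uses
`[=, &foo]`	Captures needed variables by value by default, but captures `foo` by reference
`[bar]`	Captures only `bar`, by value, and nothing else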
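
The difference between the two capture modes shows up when the captured variable changes after the lambda is created. This is a minimal sketch; capture_demo is a hypothetical helper written for the illustration.

```cpp
#include <utility>  // std::pair

// Returns {result of the by-value lambda, result of the by-reference lambda}.
std::pair<int, int> capture_demo() {
    int base = 5;
    auto by_value = [base](int x) { return x + base; };       // copies base now
    auto by_reference = [&base](int x) { return x + base; };  // refers to base

    base = 100;  // modified after both lambdas were created

    return {by_value(10), by_reference(10)};
}
```

The by-value lambda still sees the 5 it copied and yields 15, while the by-reference lambda sees the updated 100 and yields 110.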

Lambda usage example (a filter):

#include <functional>
#include <string>
#include <vector>
using namespace std;

vector<string> str_filter(vector<string>& vec,
                          function<bool(string&)> matched) {
    vector<string> result;
    for (string tmp : vec) {
        if (matched(tmp)) // call the "match predicate"
            result.push_back(tmp);
    }
    return result;
}

int main() {
    vector<string> vec = {"www.baidu.com", "www.kernel.org", "www.google.com"};
    string pattern = ".com";

    vector<string> filtered = str_filter(
        vec,
        [&](string& str) { // lambda, passed in as a function object
            if (str.find(pattern) != string::npos) // pattern is captured by reference
                return true;
            return false;
        }
    );
}
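
The same filter can also be expressed with std::copy_if, shown here as an alternative sketch rather than a replacement for the version above. filter_containing is a hypothetical name, and the lambda captures only pattern, by reference.

```cpp
#include <algorithm>  // std::copy_if
#include <iterator>   // std::back_inserter
#include <string>
#include <vector>

// Returns the strings in urls that contain pattern, in their original order.
std::vector<std::string> filter_containing(const std::vector<std::string>& urls,
                                           const std::string& pattern) {
    std::vector<std::string> result;
    std::copy_if(urls.begin(), urls.end(), std::back_inserter(result),
                 [&pattern](const std::string& s) {
                     return s.find(pattern) != std::string::npos;
                 });
    return result;
}
```

For the three URLs used above and the pattern ".com", the result holds "www.baidu.com" and "www.google.com". Passing the lambda straight to the algorithm avoids the type erasure of std::function, at the cost of a less general interface.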