Software Engineer

Bit Manipulation - Hamming Distance

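Compute the Hamming distance between two integers (LeetCode 461). The Hamming distance between two integers is the number of bit positions at which they differ. Given two integers x and y, compute their Hamming distance. Equivalently: for two integers m and n, how many bits of m must be flipped to obtain n?

    /** Use Brian Kernighan's way to count bits */
    public int hammingDistance(int x, int y) {
        x = x ^ y;          // bits that differ between x and y are now 1
        y = 0;              // reuse y as the counter
        while (x != 0) {
            y++;
            x &= x - 1;     // clear the lowest set bit
        }
        return y;
    }

    public class Solution {
        public int hammingDistance(int x, int y) {
            return Integer.bitCount(x ^ y);
        }
    }

The same Brian Kernighan algorithm is used again: ...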
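As a rough cross-check of the XOR-then-count idea above, here is a minimal Python sketch. The name `hamming_distance` and the `bits` parameter are illustrative assumptions, not part of the original solution. The mask is needed because Python integers are unbounded, so inputs are truncated to a 32-bit pattern to mimic Java's `int`.

```python
def hamming_distance(x, y, bits=32):
    """Count differing bit positions of x and y, viewed as `bits`-wide patterns."""
    mask = (1 << bits) - 1      # assumption: emulate a fixed-width integer
    diff = (x ^ y) & mask       # 1 wherever the two patterns differ
    count = 0
    while diff:
        diff &= diff - 1        # Brian Kernighan: clear the lowest set bit
        count += 1
    return count


if __name__ == "__main__":
    print(hamming_distance(1, 4))   # 0b001 vs 0b100
    print(hamming_distance(-1, 0))  # all 32 bits differ
```

The loop runs once per differing bit, so inputs that differ in only a few positions finish quickly.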

Bit Manipulation - Sum of Two Integers Without + or -

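Sum two integers a and b without using + or -, relying only on bitwise operations such as ^ and & (plus a shift). See the reference.

Adding digit by digit may produce a carry, so an addition can be split into two parts. For example, 759 + 674 splits into the part that ignores carries, 323, and the part that holds only the carries, 1110, so 759 + 674 = 323 + 1110 = 1433.

Binary addition likewise proceeds from the low bits to the high bits:

1. Add each bit while ignoring the carry: 0+0=0, 0+1=1, 1+0=1, 1+1=0. This is exactly the ^ operation. The result becomes the new a.
2. Compute the carry: a & b tells where a carry occurs, because a carry is produced only when both bits are 1. The carry, shifted left by one, becomes the new b.

Repeat this process, propagating carries from the low bits to the high bits and accumulating them into a, until the carry is 0. The final a is the answer.

    public class Solution {
        public int getSum(int a, int b) {
            while (b != 0) {       // the key is knowing when to stop
                int c = a & b;     // carry
                a ^= b;            // add without carry
                b = c << 1;
            }
            return a;
        }
    }

The operations involved amount to the truth table of multi-bit binary addition (corresponding to a full adder in hardware): ...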
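The same loop can be sketched in Python, with one caveat: Python integers have no fixed width, so without a mask the carry of a negative operand would never shift out and the loop would not terminate. The sketch below assumes 32-bit two's complement semantics (as in Java's `int`). The names `get_sum`, `MASK`, and `MAX_INT` are illustrative.

```python
MASK = 0xFFFFFFFF       # assumption: keep only the low 32 bits
MAX_INT = 0x7FFFFFFF    # largest positive 32-bit signed value


def get_sum(a, b):
    """Add two integers with ^, & and << only, emulating 32-bit wraparound."""
    a &= MASK
    b &= MASK
    while b:
        carry = (a & b) << 1    # positions where both bits are 1 carry left
        a ^= b                  # bitwise sum ignoring carries
        b = carry & MASK        # drop any carry past bit 31
    # Reinterpret the 32-bit pattern as a signed value without using minus.
    return a if a <= MAX_INT else ~(a ^ MASK)


if __name__ == "__main__":
    print(get_sum(759, 674))
    print(get_sum(-5, 3))
```

Masking after every shift is what guarantees termination: each iteration pushes the carry at least one position to the left, so it vanishes after at most 32 rounds.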

Bit Manipulation - Fancy Bit Tricks

Shifts and masks enable many neat tricks; see this video.

Check whether a number is even. This amounts to testing the lowest bit: if it is 1 the number is odd, otherwise it is even:

    (x & 1) == 0

Check if power of two:

    (x & (x - 1)) == 0

If x is a power of two, its binary form has exactly one 1 bit, e.g. 0b1000. Then x - 1 clears that bit and sets every bit to its right, giving 0b0111, so x and x - 1 differ at every position and x & (x - 1) = 0b0000. Note that the test also passes for x = 0, which is not a power of two, so x > 0 must be checked separately.

More generally, x & (x - 1) can be used to count the 1 bits in x. Counting bits set:

    unsigned int v; // count the number of bits set in v
    unsigned int c; // c accumulates the total bits set in v
    for (c = 0; v; c++) {
        v &= v - 1; // clear the least significant bit set
    }

Brian Kernighan's algorithm takes O(log N) time to count the set bits (1s) in an integer: each iteration clears the least significant set bit, and only that bit. Since each iteration converts exactly one bit from 1 to 0, it takes as many iterations as there are set bits to bring all bits to 0 (at which point v == 0 and the loop finishes). An integer n has about log2(n) bits, hence the worst case is O(log(n)) ...
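The three tricks above can be collected into a short Python sketch. The function names are illustrative, and `count_set_bits` assumes a non-negative input, since a negative Python integer behaves as if it had infinitely many leading 1 bits.

```python
def is_even(x):
    """An integer is even exactly when its lowest bit is 0."""
    return (x & 1) == 0


def is_power_of_two(x):
    """True when x has exactly one set bit; x > 0 rules out the x == 0 case."""
    return x > 0 and (x & (x - 1)) == 0


def count_set_bits(v):
    """Brian Kernighan's loop; assumes v >= 0."""
    count = 0
    while v:
        v &= v - 1      # clear the least significant set bit
        count += 1
    return count


if __name__ == "__main__":
    print(is_even(10), is_power_of_two(8), count_set_bits(0b1011))
```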

Bit Manipulation - Basic Bit Operations

Some routine operations; see this video.

Basic bit operations

Set a bit to 1:

    def set_bit(x, position):
        mask = 1 << position
        return x | mask

    bin(set_bit(0b110, 0b101))

This outputs 0b100110. Here x = 0b110 = 6, and to set bit 5 (counting from 0) we use position = 0b101 = 5, which gives mask = 0b00100000; the | then turns bit 5 into 1.

Clear a bit (1 becomes 0):

    def clear_bit(x, position):
        mask = 1 << position
        return x & ~mask

Flip a bit by XOR (^) with 1:

    def flip_bit(x, position):
        mask = 1 << position
        return x ^ mask

Use & 1 to read a bit, i.e. to test whether a given bit is 1:

    def is_bit_set(x, position):
        shifted = x >> position
        return shifted & 1

For example, 0b1100110 >> 0b101 = 0b11, and 0b11 & 0b01 = 1.

Modify a bit according to the parameter state: if state is 1 the bit is set, if it is 0 the bit is cleared:

    def modify_bit(x, position, state):
        mask = 1 << position
        return (x & ~mask) | (-state & mask)

If state = 0b1, then -state = 0b11111111 (all ones in two's complement). If state = 0b0, then -state = 0b0.
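To make the `-state` trick in `modify_bit` concrete, the sketch below repeats the function and walks through both values of `state`. It relies on Python treating -1 as an unbounded run of 1 bits in bitwise operations, so `-state & mask` is either `mask` or 0. The sample values are arbitrary.

```python
def modify_bit(x, position, state):
    """Set the bit at `position` when state is 1, clear it when state is 0."""
    mask = 1 << position
    # (x & ~mask) clears the target bit; (-state & mask) is mask or 0.
    return (x & ~mask) | (-state & mask)


if __name__ == "__main__":
    x = 0b110
    print(bin(modify_bit(x, 5, 1)))          # bit 5 turned on
    print(bin(modify_bit(0b100110, 5, 0)))   # bit 5 turned off again
    print(bin(modify_bit(x, 1, 1)))          # already set, value unchanged
```

Clearing first and then OR-ing in the desired bit avoids any branch on `state`.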

Bit Manipulation - Binary Operators

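In many languages the char type is 8 bits wide, so it can take 256 possible values (-128 to -1 and 0 to 127). In binary, the bit patterns run from 0000 0000 to 1111 1111. For an unsigned integer every bit contributes to the value, while for a signed number the highest bit is the sign bit (0 for positive, 1 for negative), which leaves only n-1 bits for the magnitude. In theory, then, the range of char should run from 1111 1111 = -127 to 0111 1111 = 127. Under that scheme -128 = 1 1000 0000 would need 9 bits, so how does char manage to represent -128 with only eight?

First, because the hardware is built around addition, subtraction has to be converted into addition. Suppose the sign bit takes part in the arithmetic: 1 - 1 becomes 1 + (-1), which in binary is 0000 0001 + 1000 0001 = 1000 0010 = -2, clearly wrong. With sign-magnitude representation, letting the sign bit take part in the calculation gives incorrect results for subtraction. This is why computers do not use sign-magnitude internally to represent numbers.

To avoid this error, ones' complement is introduced (the ones' complement of a positive number is the number itself; for a negative number the sign bit stays and the remaining bits are inverted). The sign-magnitude form of -1, 1000 0001, has the ones' complement 1111 1110, so 1 + (-1) = [0000 0001]ones + [1111 1110]ones = [1111 1111]ones, which converts back to sign-magnitude 1000 0000 = -0. With ones' complement, the magnitude part of the subtraction result is correct. ...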
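The two 1 + (-1) calculations above can be replayed with a small Python sketch. All helper names are hypothetical; the encoders assume 8-bit patterns and inputs between -127 and 127. The decoders return a `(is_negative, magnitude)` pair so that "negative zero" stays visible, which a plain Python integer cannot express. Proper ones' complement addition also needs an end-around carry, which this example never triggers and therefore omits.

```python
BITS = 8
MASK = (1 << BITS) - 1      # 0b1111_1111
SIGN = 1 << (BITS - 1)      # 0b1000_0000


def encode_sign_magnitude(n):
    """Sign bit plus absolute value; assumes -127 <= n <= 127."""
    return (SIGN | -n) if n < 0 else n


def decode_sign_magnitude(pattern):
    return (bool(pattern & SIGN), pattern & ~SIGN & MASK)


def encode_ones_complement(n):
    """Negative numbers invert every bit of the magnitude; assumes -127 <= n <= 127."""
    return (~(-n)) & MASK if n < 0 else n


def decode_ones_complement(pattern):
    if pattern & SIGN:
        return (True, ~pattern & MASK)
    return (False, pattern)


def add_raw(p, q):
    """Plain binary addition of two bit patterns, truncated to 8 bits."""
    return (p + q) & MASK


if __name__ == "__main__":
    sm = add_raw(encode_sign_magnitude(1), encode_sign_magnitude(-1))
    oc = add_raw(encode_ones_complement(1), encode_ones_complement(-1))
    print(format(sm, "08b"), decode_sign_magnitude(sm))
    print(format(oc, "08b"), decode_ones_complement(oc))
```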

Algorithms 03 - Memory

Memory

・Bit. 0 or 1.
・Byte. 8 bits.
・Megabyte (MB). 1 million or $2^{20}$ bytes.
・Gigabyte (GB). 1 billion or $2^{30}$ bytes.

64-bit machine. We assume a 64-bit machine with 8-byte pointers (references).

・Can address more memory.
・Pointers use more space (some JVMs "compress" ordinary object pointers to 4 bytes to avoid this cost).

Typical memory usage for primitive types and arrays

| primitive type | bytes |
| --- | --- |
| boolean | 1 |
| byte | 1 |
| char | 2 |
| int | 4 |
| float | 4 |
| long | 8 |
| double | 8 |

...

Algorithms 02 - Amortized Analysis

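Suppose there are two ways to pay a tax:

1. Pay 3 coins every day.
2. Pay an amount that grows exponentially each time, while payments are requested exponentially less often:
   ・Day 1: pay 1
   ・Day 2: pay 2 (total 3)
   ・Day 4: pay 4 (total 7)
   ・Day 8: pay 8 (total 15)

Which one costs less? The second is the better deal: the running total by day n never reaches 2n, so it is essentially equivalent to paying 2 per day, which is what amortized constant means.

A more rigorous examination of amortized analysis is done here, in three steps:

1. Pick a cost model (like in regular runtime analysis).
2. Compute the average cost of the i'th operation.
3. Show that this average (amortized) cost is bounded by a constant.

A similar application shows up in the geometric resizing method discussed for ArrayList growth (which is also what Python's list uses): growing the array by a multiplicative factor makes the ArrayList add operation amortized constant time. ...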

Algorithms 01 - Asymptotic Analysis

Resource and Reference:

・CS61B Berkeley - Josh Hug
・Algorithms Princeton - Robert Sedgewick, Kevin Wayne

Efficiency comes from two sources:

・Programming cost: How long does it take to develop the program? Is the code easy to read, modify, and maintain (most of the cost comes from maintenance and extensibility)?
・Execution cost: How long does the program take to run (time complexity)? How much memory does it need (space complexity)?

Asymptotic Analysis

Care about what happens for very large N (asymptotic behavior). We want to consider which types of algorithms best handle scalability: algorithms that scale well have better asymptotic runtime behavior.

Simplification Summary

1. Only consider the worst case.
2. Pick a representative operation (aka: cost model).
3. Ignore lower order terms.
4. Ignore multiplicative constants.

Simplified Analysis Process ...

Stanford CS106A/B Programming Intro

Study notes for Stanford CS106B Programming Abstractions and CS106A. Implementations of the course assignments (cs106b spring 2017) are at https://github.com/ShootingSpace/cs106b-programming-abstraction

Topics:

・A: Intro (by Java)
・B: Recursion, algorithms analysis (sort/search/hash), dynamic data structures (lists, trees, heaps), data abstraction (stacks, queues, maps), implementation strategies/tradeoffs

Purposes

・become acquainted with the C++ programming language
・learn more advanced programming techniques
・explore classic data structures and algorithms
・and apply these tools to solving complex problems

Reference

・Text Book: Data Structures & Algorithm Analysis in C++, 4th ed, by Mark A. Weiss
・Text Book: Programming Abstractions in C++ 1st Edition by Eric Roberts
・Text Book: Algorithms, 4th Edition
・Blog: Red Blob Games, Amit's A* Pages

Coding style

Why writing clean, well-structured code ...

Java Hash @Override equals() hashCode()

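This post covers:

・The relationship between hashCode and equals
・How the hashCode method is implemented under the hood
・Principles and practices to keep in mind during development

HashSet, HashMap, Hashtable

HashSet is backed by a HashMap. HashMap uses hashCode and equals to compare objects. Take HashSet and add() as an example (other data structures, and methods such as remove and contains, behave similarly). Suppose the HashSet already contains obj1; when HashSet.add(obj2) is called:

1. obj2.hashCode() is computed first, to locate the bucket.
2. If the hash codes differ, obj2 can be added directly; there is no need to call equals to decide whether the objects are equal.
3. If the hash codes are the same and obj1 == obj2, the object is already present and does not need to be added; equals is not called.
4. Otherwise the hash codes are the same but the references differ, so obj2.equals(obj1) is called to decide.

In the code below, the HashSet stores only object a, yet checking whether it contains object b returns true.

    HashSet<String> wordSet = new HashSet<String>();
    String a = "hello";
    String b = "hello";
    wordSet.add(a);
    return wordSet.contains(b); // returns true

According to the Javadoc for Set, add "adds the specified element e to this set if the set contains no element e2 such that (e==null ? e2==null : e.equals(e2))". ...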
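The same contract can be illustrated outside Java. The sketch below is a Python analogue, not Java's implementation: `__hash__` and `__eq__` play the roles of hashCode() and equals(). The classes `Word` and `PlainWord` are hypothetical and exist only for this illustration.

```python
class Word:
    """Value-based equality: two Word objects with the same text are equal."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        return isinstance(other, Word) and self.text == other.text

    def __hash__(self):
        # Equal objects must produce equal hashes, so hash the same field
        # that __eq__ compares.
        return hash(self.text)


class PlainWord:
    """No overrides: equality and hashing fall back to object identity."""

    def __init__(self, text):
        self.text = text


if __name__ == "__main__":
    a, b = Word("hello"), Word("hello")
    print(a is b, b in {a})    # distinct objects, yet the set "contains" b

    p, q = PlainWord("hello"), PlainWord("hello")
    print(q in {p})            # identity-based, so q is not found
```

Overriding only one of the two methods breaks the lookup in both languages, which is why equals() and hashCode() are overridden together.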