Hycz's Blog

Life is a game. Why so serious?

Monthly Archives: November 2011

Cassandra 0.8.0 Flow Analysis (1): CassandraDaemon Startup Flow

I. Entry Point

The entry point is the main method of CassandraDaemon, which contains a single call: new CassandraDaemon().activate().

    public static void main(String[] args)
    {
        new CassandraDaemon().activate();
    }

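The activate() method called here is defined in AbstractCassandraDaemon. Although Cassandra is written in Java, it really does have a C flavor, as an article I once read pointed out. Concretely, static variables are everywhere and static {} blocks are used heavily. A static block is guaranteed to run when its class is first used, so the code ends up reading like procedural C. Moreover, a debugger does not stop in these blocks when you step through the code, so you have to set breakpoints inside them by hand or they simply run to completion unnoticed. That is painful for a hopeless code reader like me, but you get used to it.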
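To make the static-block behavior concrete, here is a minimal sketch. The class names (StaticInitDemo, Config) are made up for illustration and have nothing to do with Cassandra's own classes; the point is only that the block runs once, on first use of the class.

```java
import java.util.ArrayList;
import java.util.List;

final class StaticInitDemo {
    // Records what happened and in which order, so the effect is observable.
    static final List<String> EVENTS = new ArrayList<>();

    static final class Config {
        static final String NAME;

        // Runs exactly once, when the JVM initializes Config on its first use.
        static {
            EVENTS.add("Config static block");
            NAME = "loaded";
        }

        static String name() {
            return NAME;
        }
    }

    public static void main(String[] args) {
        EVENTS.add("before first use");
        Config.name(); // first use triggers the static block
        Config.name(); // later uses do not run it again
        EVENTS.add("after use");
        System.out.println(EVENTS);
    }
}
```

A breakpoint on the line inside the static block is the only way to stop there while debugging, which is the inconvenience described above.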

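II. setup

Continuing from above: AbstractCassandraDaemon first initializes log4j and schedules a periodic check for configuration changes. Only then does execution actually enter activate().

This method does two things in two steps: first it initializes the daemon, then it starts the daemon.

Initialization is done by the setup() method of AbstractCassandraDaemon, which is a long and involved process. Note that the initialization of almost every class below includes registering a log4j Logger and an MBean, so I will not mention those separately.

1. Set up the logger

2. Check CLibrary

3. Initialize DatabaseDescriptor

The full name is org.apache.cassandra.config.DatabaseDescriptor. In the same style as above, this class's initialization lives in a static block. Broadly speaking, it reads the configuration files and stores the values in the corresponding variables. Looking more closely, there are two parts:

Part 1 reads cassandra.yaml and loads its parameters into variables. Along the way it validates the parameter values and, based on the configured class names, loads the corresponding classes or creates the corresponding objects.

Part 2 creates the system tables. Their configuration is hardcoded rather than read from the configuration file. Specifically, it builds the metadata for a keyspace named system, which contains the following column families:

  • StatusCf: persistent metadata for the local node
  • HintsCf: hinted handoff data
  • MigrationsCf: individual schema mutations
  • SchemaCf: current state of the schema
  • IndexCf: indexes that have been completed
  • NodeIdCf: nodeId and their metadata

The relevant code is: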

            // Hardcoded system tables
            KSMetaData systemMeta = new KSMetaData(Table.SYSTEM_TABLE,
                                                   LocalStrategy.class,
                                                   KSMetaData.optsWithRF(1),
                                                   CFMetaData.StatusCf,
                                                   CFMetaData.HintsCf,
                                                   CFMetaData.MigrationsCf,
                                                   CFMetaData.SchemaCf,
                                                   CFMetaData.IndexCf,
                                                   CFMetaData.NodeIdCf);
            CFMetaData.map(CFMetaData.StatusCf);
            CFMetaData.map(CFMetaData.HintsCf);
            CFMetaData.map(CFMetaData.MigrationsCf);
            CFMetaData.map(CFMetaData.SchemaCf);
            CFMetaData.map(CFMetaData.IndexCf);
            CFMetaData.map(CFMetaData.NodeIdCf);
            tables.put(Table.SYSTEM_TABLE, systemMeta);

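This also shows what a keyspace's metadata has to contain: a name, a replication strategy, the strategy's options, and the metadata of its ColumnFamilies. Note that only the metadata is built here; no actual keyspace object is created yet.

The initialization of this class is important: it is the foundation of the whole daemon startup.

4. Initialize StorageService

The full name of this class is org.apache.cassandra.service.StorageService. It is actually initialized during DatabaseDescriptor's initialization, which calls one of its methods and thereby triggers initialization on first use. StorageService's initialization has three parts.

The first part defines a set of verbs (VERBS) and the stage each verb runs on (verbStages):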

/* All verb handler identifiers */
    public enum Verb
    {
        MUTATION,
        BINARY,
        READ_REPAIR,
        READ,
        REQUEST_RESPONSE, // client-initiated reads and writes
        STREAM_INITIATE, // Deprecated
        STREAM_INITIATE_DONE, // Deprecated
        STREAM_REPLY,
        STREAM_REQUEST,
        RANGE_SLICE,
        BOOTSTRAP_TOKEN,
        TREE_REQUEST,
        TREE_RESPONSE,
        JOIN, // Deprecated
        GOSSIP_DIGEST_SYN,
        GOSSIP_DIGEST_ACK,
        GOSSIP_DIGEST_ACK2,
        DEFINITIONS_ANNOUNCE,
        DEFINITIONS_UPDATE_RESPONSE,
        TRUNCATE,
        SCHEMA_CHECK,
        INDEX_SCAN,
        REPLICATION_FINISHED,
        INTERNAL_RESPONSE, // responses to internal calls
        COUNTER_MUTATION,
        // use as padding for backwards compatability where a previous version needs to validate a verb from the future.
        UNUSED_1,
        UNUSED_2,
        UNUSED_3,
        ;
        // remember to add new verbs at the end, since we serialize by ordinal
    }
    public static final Verb[] VERBS = Verb.values();

    public static final EnumMap<StorageService.Verb, Stage> verbStages = new EnumMap<StorageService.Verb, Stage>(StorageService.Verb.class)
    {{
        put(Verb.MUTATION, Stage.MUTATION);
        put(Verb.BINARY, Stage.MUTATION);
        put(Verb.READ_REPAIR, Stage.MUTATION);
        put(Verb.READ, Stage.READ);
        put(Verb.REQUEST_RESPONSE, Stage.REQUEST_RESPONSE);
        put(Verb.STREAM_REPLY, Stage.MISC); // TODO does this really belong on misc? I've just copied old behavior here
        put(Verb.STREAM_REQUEST, Stage.STREAM);
        put(Verb.RANGE_SLICE, Stage.READ);
        put(Verb.BOOTSTRAP_TOKEN, Stage.MISC);
        put(Verb.TREE_REQUEST, Stage.ANTI_ENTROPY);
        put(Verb.TREE_RESPONSE, Stage.ANTI_ENTROPY);
        put(Verb.GOSSIP_DIGEST_ACK, Stage.GOSSIP);
        put(Verb.GOSSIP_DIGEST_ACK2, Stage.GOSSIP);
        put(Verb.GOSSIP_DIGEST_SYN, Stage.GOSSIP);
        put(Verb.DEFINITIONS_ANNOUNCE, Stage.READ);
        put(Verb.DEFINITIONS_UPDATE_RESPONSE, Stage.READ);
        put(Verb.TRUNCATE, Stage.MUTATION);
        put(Verb.SCHEMA_CHECK, Stage.MIGRATION);
        put(Verb.INDEX_SCAN, Stage.READ);
        put(Verb.REPLICATION_FINISHED, Stage.MISC);
        put(Verb.INTERNAL_RESPONSE, Stage.INTERNAL_RESPONSE);
        put(Verb.COUNTER_MUTATION, Stage.MUTATION);
        put(Verb.UNUSED_1, Stage.INTERNAL_RESPONSE);
        put(Verb.UNUSED_2, Stage.INTERNAL_RESPONSE);
        put(Verb.UNUSED_3, Stage.INTERNAL_RESPONSE);
    }};
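The comment "we serialize by ordinal" in the enum above is worth a small illustration. The following is only a sketch with a shortened, hypothetical verb and stage list (not Cassandra's real ones): a verb travels as its ordinal, is decoded through the VERBS array, and the EnumMap then tells us which stage should handle it.

```java
import java.util.EnumMap;
import java.util.Map;

final class VerbStageDemo {
    // Hypothetical, shortened lists for illustration only.
    enum Stage { READ, MUTATION, GOSSIP }

    // New verbs must be appended at the end: the ordinal is what gets serialized.
    enum Verb { MUTATION, READ, GOSSIP_DIGEST_SYN, TRUNCATE }

    static final Verb[] VERBS = Verb.values();

    static final Map<Verb, Stage> VERB_STAGES = new EnumMap<>(Verb.class);
    static {
        VERB_STAGES.put(Verb.MUTATION, Stage.MUTATION);
        VERB_STAGES.put(Verb.READ, Stage.READ);
        VERB_STAGES.put(Verb.GOSSIP_DIGEST_SYN, Stage.GOSSIP);
        VERB_STAGES.put(Verb.TRUNCATE, Stage.MUTATION);
    }

    /** Decodes a verb from its serialized ordinal and returns the stage that handles it. */
    static Stage stageForOrdinal(int ordinal) {
        if (ordinal < 0 || ordinal >= VERBS.length)
            throw new IllegalArgumentException("unknown verb ordinal " + ordinal);
        return VERB_STAGES.get(VERBS[ordinal]);
    }

    public static void main(String[] args) {
        int wireValue = Verb.TRUNCATE.ordinal();
        System.out.println(VERBS[wireValue] + " -> " + stageForOrdinal(wireValue));
    }
}
```

Inserting a verb in the middle of the enum would shift every later ordinal, which is why the padding entries UNUSED_1 to UNUSED_3 exist in the real list.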

The second part creates two thread pools, one for periodic short tasks and one for longer, usually non-periodic tasks:

    /**
     * This pool is used for periodic short (sub-second) tasks.
     */
     public static final RetryingScheduledThreadPoolExecutor scheduledTasks = new RetryingScheduledThreadPoolExecutor("ScheduledTasks");

    /**
     * This pool is used by tasks that can have longer execution times, and usually are non periodic.
     */
    public static final RetryingScheduledThreadPoolExecutor tasks = new RetryingScheduledThreadPoolExecutor("NonPeriodicTasks");

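The third part instantiates a public static final StorageService object. During instantiation it first registers the verb handlers and then initializes an org.apache.cassandra.streaming.StreamingService (yes, calls nested within calls...). That class's initialization merely creates the instance, and the constructor does nothing beyond registering an MBean.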

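5. Get the network address and port

6. Initialize ColumnFamilyStore

Likewise, this class is initialized the first time it is used, which means evaluating all static definitions and running the code in the static block.

There are three static definitions:

  • flushSorter: a thread pool that runs the sorting phase of a flush. Because this phase is CPU-bound, the number of threads is set according to the number of processors.
  • flushWriter: after sorting, this thread pool writes to disk. Because it is disk-bound, it can run concurrently with flushSorter. Note that flushSorter and flushWriter handle flushes of both Memtable and BinaryMemtable; for a BinaryMemtable flush these two pools are all that is needed. Both pools are private.
  • postFlushExecutor: unlike the two pools above, this one is used for flushing live Memtables. Flushing a live Memtable is more involved and requires switchMemtable to do two extra things (here only switchMemtable calls submitFlush). First, it puts the Memtable into memtablesPendingFlush until the flush completes and the Memtable has been turned into an SSTableReader and added to ssTables_. Second, once the flush completes, it adds an entry to commitLogUpdater (markCompacted) that calls onMemtableFlush. This allows several flushes to run concurrently on a multicore system while onMemtableFlush is still called in the correct order. The order matters for replay after a restart, because when onMemtableFlush is called, everything up to the given position is assumed to have been persisted to SSTables. Note that this pool is public, so other classes use it as well. Within this class it is used in two places: one adds the flushed marker to the commit log header, and the other adds the compacted marker to commitLogUpdater.

In the static block, a task is scheduled on the StorageService.tasks thread pool with an initial delay of 1 second and an interval of 1 second between runs. The task is an org.apache.cassandra.db.MeteredFlusher object.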
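The scheduling just described can be sketched with the JDK's own ScheduledExecutorService. This is only an illustration, assuming a fixed-delay schedule; PeriodicTaskDemo and its methods are invented names, and the plain JDK executor stands in for Cassandra's RetryingScheduledThreadPoolExecutor.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class PeriodicTaskDemo {
    /** Schedules the task with an initial delay and a fixed delay between runs. */
    static ScheduledExecutorService schedule(Runnable task, long delayMillis, long intervalMillis) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        pool.scheduleWithFixedDelay(task, delayMillis, intervalMillis, TimeUnit.MILLISECONDS);
        return pool;
    }

    /** Returns true if a counting task fires at least `runs` times before the timeout. */
    static boolean firesAtLeast(int runs, long intervalMillis, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(runs);
        ScheduledExecutorService pool = schedule(latch::countDown, intervalMillis, intervalMillis);
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // The post describes a 1 s delay and a 1 s interval; 50 ms keeps the demo quick.
        System.out.println(firesAtLeast(3, 50, 5_000));
    }
}
```

With a fixed delay, the interval is measured from the end of one run to the start of the next, so a slow run never causes runs to pile up.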

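7. Periodic runs of the MeteredFlusher thread

First, it measures the size of the inactive Memtables.

Then comes the flush itself, which has two parts. Let m be the maximum number of Memtables that can be in the flush pipeline. The first part flushes every ColumnFamily whose used memory exceeds 1/m of the allotted memory. The second part sorts the remaining Memtables by size and flushes those that exceed the threshold.

8. Scrub the system table directories

This step cleans up the directories that hold the system tables, so that the system tables, the bedrock of the whole system, are in a sane state. That makes sense: the system can go down at any moment, so the data files persisted on disk may be left in all sorts of dirty states. This step corrects those possible errors, albeit in a rather crude way. Specifically, it calls ColumnFamilyStore.scrubDataDirectories for every system table: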

        // check the system table to keep user from shooting self in foot by changing partitioner, cluster name, etc.
        // we do a one-off scrub of the system table first; we can't load the list of the rest of the tables,
        // until system table is opened.
        for (CFMetaData cfm : DatabaseDescriptor.getTableMetaData(Table.SYSTEM_TABLE).values())
            ColumnFamilyStore.scrubDataDirectories(Table.SYSTEM_TABLE, cfm.cfName);

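While it runs, this method removes unnecessary files from the ColumnFamily's directory. Unnecessary files are: temp files, orphans (components that lack a data file), zero-length files, and compacted sstables. Files it cannot recognize are skipped. After the method returns, every Descriptor matching the description above has been removed.

In the implementation, temp files and compacted sstables are handled first; they carry markers, so they are easy to clear. Next come orphans, which requires checking for the data file. After that come unfinished cache files, which are matched by file name and deleted. Finally, the ColumnFamily's column indexes are cleaned by calling scrubDataDirectories recursively for the index of each Column in the ColumnFamily. The interesting part is that the argument that would normally be the column family name is instead the indexName of a particular column. I have not read that part of the code yet, but this suggests that Cassandra implements these indexes as special ColumnFamilies.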
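To give a feel for this kind of cleanup, here is a rough sketch that handles only two of the cases (temp files and zero-length files). It is not Cassandra's scrubDataDirectories: the "-tmp-" marker, the file names and the ScrubDemo class are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ScrubDemo {
    // Assumed naming convention for this sketch: temporary files contain "-tmp-".
    static final String TEMP_MARKER = "-tmp-";

    /** Deletes temporary and zero-length regular files in dir; returns the deleted names, sorted. */
    static List<String> scrub(Path dir) {
        try {
            List<Path> doomed = new ArrayList<>();
            try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
                for (Path file : files) {
                    if (!Files.isRegularFile(file))
                        continue; // anything that is not a plain file is skipped
                    boolean temporary = file.getFileName().toString().contains(TEMP_MARKER);
                    if (temporary || Files.size(file) == 0)
                        doomed.add(file);
                }
            }
            List<String> removed = new ArrayList<>();
            for (Path file : doomed) {
                Files.delete(file);
                removed.add(file.getFileName().toString());
            }
            Collections.sort(removed);
            return removed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a throwaway directory with one good, one temporary and one empty file, then scrubs it. */
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("scrub-demo");
            Path good = dir.resolve("Standard1-g-2-Data.db");
            Files.write(good, new byte[] {1, 2, 3});
            Files.write(dir.resolve("Standard1-tmp-g-1-Data.db"), new byte[] {4});
            Files.createFile(dir.resolve("empty-Data.db"));
            List<String> removed = scrub(dir);
            Files.delete(good); // fails if scrub wrongly removed the good file
            Files.delete(dir);  // fails if anything else was left behind
            return removed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Candidates are collected first and deleted after the directory stream is closed, so the listing is never modified while it is being iterated.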

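9. Initialize Table

The full name is org.apache.cassandra.db.Table. Naturally, this class is also initialized the first time it is used. Interestingly, where that first use happens is not deterministic.

The first place is in SystemTable's checkHealth() method:

            table = Table.open(Table.SYSTEM_TABLE);

checkHealth() is called from AbstractCassandraDaemon's setup() method, which is the real origin of the call. So the call chain is: AbstractCassandraDaemon.setup() -> SystemTable.checkHealth() -> Table.open(Table.SYSTEM_TABLE)

The second place is in ColumnFamilyStore's all() method:

        for (Table table : Table.all())

The real origin here is MeteredFlusher's run() method, and the call chain is: MeteredFlusher.run() -> MeteredFlusher.countFlushingBytes() -> ColumnFamilyStore.all() -> Table.all() -> Table.open(tableName)

Because these two call sites run on different threads, it is not determined which executes first. The important point, however, is that many of Table's methods must be synchronized across threads, meaning only one thread at a time may execute them. open() is one of these.

That was a bit of a digression, but understanding what Table is for helps in understanding its initialization. The code in its static block is unremarkable: it just checks whether the directories for each keyspace exist and creates any that are missing. What matters more is the purpose of the variables in Table.

As I said in an earlier post, a keyspace is simply a table, and the code reflects that. First there are two static variables. One is a reentrant read-write lock: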

    /**
     * accesses to CFS.memtable should acquire this for thread safety.
     * Table.maybeSwitchMemtable should aquire the writeLock; see that method for the full explanation.
     *
     * (Enabling fairness in the RRWL is observed to decrease throughput, so we leave it off.)
     */
    static final ReentrantReadWriteLock switchLock = new ReentrantReadWriteLock();

The other is the map that holds all Table instances:

    /** Table objects, one per keyspace.  only one instance should ever exist for any given keyspace. */
    private static final Map<String, Table> instances = new NonBlockingHashMap<String, Table>();

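The remaining non-static fields are the attributes a table should have: name, columnFamilyStores (one ColumnFamilyStore per column family), indexLocks, flushTask, and replicationStrategy.

In any case, Table.open() is a crucial entry point. What it does is simple: it looks up the table with the given name in instances and creates a new one if none is found. Since each keyspace must have exactly one Table object, the block that creates the table is wrapped in synchronized so that threads are serialized there.

The Table constructor reads the configuration and then sets up the instance fields listed above (name, columnFamilyStores, indexLocks, flushTask, replicationStrategy).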
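The lookup-then-create logic of open() can be sketched as follows. This is my own reconstruction from the description above, not the actual source: KeyspaceRegistry and Keyspace are stand-in names, and a ConcurrentHashMap replaces the NonBlockingHashMap used by Cassandra.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class KeyspaceRegistry {
    /** Stand-in for Table: exactly one instance should exist per keyspace name. */
    static final class Keyspace {
        final String name;

        Keyspace(String name) {
            this.name = name;
        }
    }

    // ConcurrentHashMap stands in for NonBlockingHashMap in this sketch.
    private static final Map<String, Keyspace> instances = new ConcurrentHashMap<>();

    static Keyspace open(String name) {
        Keyspace keyspace = instances.get(name); // fast path, no locking
        if (keyspace == null) {
            synchronized (KeyspaceRegistry.class) {
                keyspace = instances.get(name); // re-check: another thread may have created it
                if (keyspace == null) {
                    keyspace = new Keyspace(name);
                    instances.put(name, keyspace);
                }
            }
        }
        return keyspace;
    }

    public static void main(String[] args) {
        System.out.println(open("system") == open("system"));
    }
}
```

The second lookup inside the synchronized block is what guarantees a single instance when two threads (for example setup() and MeteredFlusher) race to open the same keyspace. With a modern JDK, instances.computeIfAbsent(name, Keyspace::new) would achieve the same effect in one line.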

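10. System table health check

It is called a health check, but all it really does is open one Column Family of the system table and compare a few values in it.

Step one opens the system table:

            table = Table.open(Table.SYSTEM_TABLE);

There are two possible results: 1) success, and execution continues; 2) an exception, which according to the code comment happens when the partitioner has been changed (from OPP to RP): the system table files are still there but cannot be read.

Step two reads a small part of the system table (one row, two columns), laid out as in the table below. Names in capitals are constants in the code and quoted strings are their values; the same convention applies to the later tables.

column family: STATUS_CF = "LocationInfo"
columns:       PARTITIONER = "Partioner"    CLUSTERNAME = "ClusterName"
key:           LOCATION_KEY = "L"           value    value

The code is: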

        SortedSet<ByteBuffer> cols = new TreeSet<ByteBuffer>(BytesType.instance);
        cols.add(PARTITIONER);
        cols.add(CLUSTERNAME);
        QueryFilter filter = QueryFilter.getNamesFilter(decorate(LOCATION_KEY), new QueryPath(STATUS_CF), cols);
        ColumnFamily cf = table.getColumnFamilyStore(STATUS_CF).getColumnFamily(filter);

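Things are slightly more complicated here, with three possible results:

  • 1) cf is not null: the read succeeded, great;
  • 2) cf is null, so the code fetches the ColumnFamily named STATUS_CF from the system table again and checks its SSTables:
    • 2.1) if there are SSTables, an exception is thrown, because the files for this CF exist but cannot be read. This happens when the partitioner has been changed (from RP to OPP);
    • 2.2) if there are no SSTables, the files for this part of the system table were not found. That is fine: the node is treated as a new node and goes through the new-node flow.

In addition, constructing the QueryFilter object and obtaining the ColumnFamily object are both fairly complex processes; those are a topic for later.

Step three checks that the partitioner and cluster name that were read match the ones from the configuration file. We need them to match!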
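The decision logic of steps two and three fits in a few lines. The sketch below is a simplification under my own assumptions: the "cf is null" case is modeled by passing null for the saved values, and HealthCheckDemo, Outcome and the exception type are invented for the example.

```java
final class HealthCheckDemo {
    enum Outcome { EXISTING_NODE_OK, NEW_NODE }

    /**
     * savedPartitioner and savedCluster are null when the row could not be read
     * (the "cf is null" case); hasSSTables says whether data files exist on disk.
     */
    static Outcome check(String savedPartitioner, String savedCluster, boolean hasSSTables,
                         String configuredPartitioner, String configuredCluster) {
        if (savedPartitioner == null || savedCluster == null) {
            if (hasSSTables) // case 2.1: files exist but nothing could be read
                throw new IllegalStateException("system table files exist but could not be read");
            return Outcome.NEW_NODE; // case 2.2: nothing on disk, treat as a new node
        }
        // Step three: the saved values must match the configuration.
        if (!savedPartitioner.equals(configuredPartitioner))
            throw new IllegalStateException("saved partitioner does not match configuration");
        if (!savedCluster.equals(configuredCluster))
            throw new IllegalStateException("saved cluster name does not match configuration");
        return Outcome.EXISTING_NODE_OK;
    }

    public static void main(String[] args) {
        System.out.println(check(null, null, false, "RandomPartitioner", "Test Cluster"));
    }
}
```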

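11. Load the schema

Step one tries to get the latest migration ID. It is again read from the system table and stored in a variable of type UUID. Let us call this variable version; it is used again below, and having a name makes the explanation easier. The exact location is:

column family: SCHEMA_CF = "Schema"
columns:       LAST_MIGRATION_KEY = "Last Migration"
key:           LAST_MIGRATION_KEY = "Last Migration"    value

Step two checks whether this value was found. There are two cases. 1) Not found, i.e. version == null. The code then checks whether data files exist. If they do, the system cannot read the schema and the user has to redefine it with the CLI. If there are none, all is well: this is an empty node on which no tables have been created. 2) The version was found, in which case org.apache.cassandra.db.DefsTable.loadFromStorage(UUID version) is called.

Step three, continuing from the last branch above, loads the given version of the keyspaces from storage. Specifically, it reads a JSON string from the DEFINITION_SCHEMA_COLUMN_NAME column of the system table:

column family: SCHEMA_CF = "Schema"
columns:       DEFINITION_SCHEMA_COLUMN_NAME = "Avro/Schema"  |  one column per keyspace; the column name is the keyspace name, as plain text
key: version   value: the Schema as JSON                      |  value: that keyspace's definition, encoded with the schema on the left; after reading, it must be decoded with that schema

The value then passes through several representations, IColumn => ByteBuffer => String => Schema, to yield a Schema object. Finally, deserialization produces the collection of metadata for all keyspaces, Collection<KSMetaData>.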

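  • IColumn => ByteBuffer: take the IColumn's value directly
  • ByteBuffer => String: call ByteBufferUtil.string(value) to get a JSON string
  • String => Schema: call Schema.parse(s). This method comes from the Avro project. The details are not discussed here, but here is a brief introduction to Avro's schema support, translated from http://www.cloudera.com/blog/2009/11/avro-a-new-format-for-data-interchange/:

    Avro uses JSON to define a data structure's schema. For example, a two-dimensional point might be defined as the following Avro record:

    {"type": "record", "name": "Point",
    "fields": [
    {"name": "x", "type": "int"},
    {"name": "y", "type": "int"}
    ]
    }

    Each instance of this record is serialized as simply two integers, with no additional per-record or per-field annotations. Integers are written using a variable-length zig-zag encoding. So points with small coordinate values can be written in just two bytes: 100 points might require only about 200 bytes.

    In addition to records and numeric types, Avro includes support for arrays, maps, enums, variable- and fixed-length binary data, and strings. Avro also defines a container file format intended to provide good support for MapReduce and other analytical frameworks. For details, see the Avro specification.

  • Collection<KSMetaData>: the Schema object obtained at the start of step three is used to deserialize the values of the other Columns in this ColumnFamily one by one, each yielding a KsDef object. Together they make up the metadata of all keyspaces. This Schema is, in effect, the metadata of the keyspace metadata.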
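The claim in the quoted passage that a point with small coordinates fits in two bytes follows from the zig-zag variable-length encoding. Here is a small stand-alone sketch of that encoding for 32-bit ints, written with the JDK only (it does not use the Avro library, and ZigZagDemo is a name I made up):

```java
import java.io.ByteArrayOutputStream;

final class ZigZagDemo {
    /** Encodes an int as a zig-zag varint: small magnitudes, positive or negative, use few bytes. */
    static byte[] encode(int n) {
        int z = (n << 1) ^ (n >> 31); // zig-zag: 0, -1, 1, -2, ... become 0, 1, 2, 3, ...
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((z & ~0x7F) != 0) {        // more than 7 bits remain
            out.write((z & 0x7F) | 0x80); // low 7 bits with the continuation bit set
            z >>>= 7;
        }
        out.write(z);
        return out.toByteArray();
    }

    /** Decodes a byte array produced by encode. */
    static int decode(byte[] bytes) {
        int z = 0;
        int shift = 0;
        for (byte b : bytes) {
            z |= (b & 0x7F) << shift;
            shift += 7;
        }
        return (z >>> 1) ^ -(z & 1); // undo the zig-zag mapping
    }

    public static void main(String[] args) {
        int x = 3, y = -4;
        int total = encode(x).length + encode(y).length;
        System.out.println("point (" + x + ", " + y + ") takes " + total + " bytes");
    }
}
```

Values from -64 to 63 take a single byte each, so a Point record with two such coordinates serializes to two bytes, matching the figure in the quote.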

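Step four, for every keyspace: register the <cfm.ksName, cfm.cfName> pair of each of its column families in the static cfIdMap of the CFMetaData class, then add the keyspace's metadata, together with the version, to the corresponding static variables of DatabaseDescriptor.

12. Scrub the directories of all tables

This is like step 8 above, except that all tables are scrubbed, because their metadata was obtained in the previous step and is now stored in DatabaseDescriptor.tables.

13. Open all tables

Table.open(table) is called for every table: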

        // initialize keyspaces
        for (String table : DatabaseDescriptor.getTables())
        {
            if (logger.isDebugEnabled())
                logger.debug("opening keyspace " + table);
            Table.open(table);
        }

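14. Start the periodic garbage collection inspector

The garbage collection inspector is the GCInspector class. It also uses the Singleton pattern and is instantiated only once. Note one thing, though: it relies on Sun-specific JVM classes to inspect garbage collection, so it is only supported on the Sun JVM. On instantiation it registers itself with the MBeanServer, and it then starts a thread that runs periodically. On each run, if a lot of memory is still in use after a full garbage collection, it will, as needed, 1) reduce the cache sizes and 2) flush the largest Memtable.

The code is essentially one call: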

        try
        {
            GCInspector.instance.start();
        }
        catch (Throwable t)
        {
            logger.warn("Unable to start GCInspector (currently only supported on the Sun JVM)");
        }

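15. CommitLog recovery

CommitLog recovery happens when the server starts. Its purpose is to deal with operations that did not complete.

Step one gets the commit log location from DatabaseDescriptor and checks the file sizes to see whether recovery is needed (empty files need no recovery). If it is, the commit logs to recover are sorted by modification time (there can be several commit log files).

Step two performs the recovery using these commit log files. This is where the main work of CommitLog recovery happens. The process is fairly complex, and the method seems to differ between versions; what follows applies to 0.8.0 only.

  1. Get the ReplayPosition of every sstable.
    First, a word about ReplayPosition. Two fields of this class matter: segment and position. In the code, segment is a point in time generated by System.currentTimeMillis(). It shows up in two places (at least, those are the only two I have seen so far): in the commit log file name, and in the Statistics.db component of an sstable (an SSTable consists of four components: Data.db, Filter.db, Index.db and Statistics.db). position is a file pointer offset that marks where in the file to start reading. So what happens here is: read each SSTable's ReplayPosition from its Statistics.db file, then for each ColumnFamily take the maximum ReplayPosition over its SSTables and store it in the cfposition variable. The exact file format is left for a later post.
  2. Take the smallest ReplayPosition in cfposition and store it in globalPosition. Smallest here means the smallest segment, i.e. the earliest time: it is the time point of the column family that was operated on earliest before the system last shut down. This time point is used below.
  3. Get the smallest segment among the commit logs. As mentioned above, the segment is part of the commit log file name, so it can be parsed directly.
  4. For each commit log, determine the position in that file from which recovery must start. The decision compares globalPosition.segment with commitlog.segment (the code is not written this way, but this notation is convenient here; likewise the replayPosition mentioned below is written as commitlog.replayPosition). There are three cases:
    1) globalPosition.segment < commitlog.segment: the operations in this commit log have not been applied yet, so commitlog.replayPosition is set to 0
    2) globalPosition.segment == commitlog.segment: the operations in this commit log were in progress, so commitlog.replayPosition is set to globalPosition.position
    3) globalPosition.segment > commitlog.segment: the operations in this commit log have already been applied, so commitlog.replayPosition is set to reader.length(), i.e. the end of this commit log.
  5. If there is anything to recover, read one entry at a time starting from replayPosition and deserialize each entry into a RowMutation object.
  6. Submit each RowMutation to the thread pool of the MUTATION stage in SEDA for execution.
    futures.add(StageManager.getStage(Stage.MUTATION).submit(runnable));
  7. Finally, once all mutations have finished, flush the tables involved.

Step three deletes all the commit logs.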
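Items 1, 2 and 4 of step two boil down to a max, a min and a three-way comparison. The sketch below follows my description rather than the real code: Position is a simplified stand-in for ReplayPosition, and ordering ties on segment by position is an assumption of mine.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

final class ReplayDemo {
    /** Simplified stand-in for ReplayPosition: ordered by segment, then by position. */
    static final class Position implements Comparable<Position> {
        final long segment;
        final int position;

        Position(long segment, int position) {
            this.segment = segment;
            this.position = position;
        }

        @Override
        public int compareTo(Position other) {
            int bySegment = Long.compare(segment, other.segment);
            return bySegment != 0 ? bySegment : Integer.compare(position, other.position);
        }
    }

    /** Items 1 and 2: the newest position per column family, then the oldest of those. */
    static Position globalPosition(Collection<List<Position>> positionsPerColumnFamily) {
        Position global = null;
        for (List<Position> sstablePositions : positionsPerColumnFamily) {
            Position newest = Collections.max(sstablePositions); // assumes a non-empty list
            if (global == null || newest.compareTo(global) < 0)
                global = newest;
        }
        return global;
    }

    /** Item 4: the offset in one commit log file at which replay should start. */
    static long replayOffset(Position global, long logSegment, long logLength) {
        if (global.segment < logSegment)
            return 0;               // nothing in this log has been applied yet
        if (global.segment == logSegment)
            return global.position; // partly applied: resume from the saved position
        return logLength;           // fully applied: start at the end, i.e. replay nothing
    }

    public static void main(String[] args) {
        List<Position> cfA = Arrays.asList(new Position(100, 10), new Position(200, 50));
        List<Position> cfB = Arrays.asList(new Position(150, 5));
        Position global = globalPosition(Arrays.asList(cfA, cfB));
        System.out.println(global.segment + ":" + global.position);
    }
}
```

Taking the minimum across column families is the conservative choice: replay starts early enough for the column family that flushed least recently, at the cost of re-applying some mutations that other column families already have on disk.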

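16. Start the server

Step one gets each node's Token, i.e. a series of <token, endpoint> pairs. They are read from the system table, at this location:

column family: STATUS_CF = "LocationInfo"
columns:       each column name is a token value
key:           RING_KEY = "Ring"    each column value is the corresponding endpoint

Step two updates the places in the system that use tokens: one is the tokenMetadata_ variable, the other is the Gossiper.

Step three defines a ShutdownHook and adds it to the Runtime.

Step four joins the Token Ring. From here on the server really is treated as a node in the network, so various listeners are set up.

The first thing is to start the Gossiper. The Gossiper is one of Cassandra's most distinctive components and quite an important one. Its job is to maintain cluster state: through gossip, every node learns which nodes the cluster contains and what state they are in. This lets any node in a Cassandra cluster route any key, and the failure of any single node is not catastrophic. A detailed introduction to the Gossiper is left for later; only the parts involved here are covered.

  1. The Gossiper uses the Singleton pattern, so there is only one instance. On instantiation it does two things: it sets two durations that serve as time limits in the gossip process, and it registers itself as a listener of the FailureDetector. The Gossiper and the FailureDetector are tied together by the IFailureDetectionEventListener interface, which has a single method, convict(InetAddress ep), that marks a node as dead. The FailureDetector implements "The Phi Accrual Failure Detector" and reaches the verdict, which the listeners then carry out.
  2. Before the Gossiper starts, two subscribers are registered: the StorageService instance and the MigrationManager instance. What they have in common is that they implement the IEndpointStateChangeSubscriber interface, which defines five methods through which a node notifies interested parties of state changes of any endpoint. The five methods are onJoin, onChange, onAlive, onDead and onRemove.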
            // have to start the gossip service before we can see any info on other nodes.  this is necessary
            // for bootstrap to get the load info it needs.
            // (we won't be part of the storage ring though until we add a nodeId to our state, below.)
            Gossiper.instance.register(this);
            Gossiper.instance.register(migrationManager);
            Gossiper.instance.start(SystemTable.incrementAndGetGeneration()); // needed for node-ring gathering.

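    Then the Gossiper is started, with the new generation number as its argument. During startup it first gets all the seed nodes. It then initializes the local node's heartbeat state (HeartBeatState), marks the local node as alive, and stores the <netAddress, EndpointState> mapping in endpointStateMap. Next it notifies the snitches that gossip is about to start. Finally it starts a periodic thread that runs the run method of the GossipTask class. What that does is described below (quoted from Yuan Daxing's document):

    Cassandra has an internal Gossiper that runs once per second (scheduled in the start method of Gossiper.java) and sends synchronization messages to other nodes according to these rules:

    1) Pick a random live node and send it a sync request

    2) Send a sync request to a random unreachable node

    3) If the node chosen in step 1 is not a seed, or the number of live nodes is smaller than the number of seeds, send a sync request to a random seed

    Steps 1 and 2 are easy to understand: step 1 synchronizes state with a live node to update the local state, and step 2 makes it possible to discover as early as possible that an unavailable node has become available again.

    The first condition in step 3 is also fairly easy to understand: if the node from step 1 is not a seed, a sync request goes to a random seed, because in theory seeds always have more node state information.

    The second condition in step 3 is harder to understand: when there are fewer live nodes than seeds, a sync message must also be sent to a random seed. The purpose is to avoid seed islands.

    Without this check, consider the following scenario: four machines {A, B, C, D}, all configured as seeds. If they start at the same time, this can happen:

    Node A comes up, finds no live nodes, reaches step 3 and syncs with a random seed, say B

    Node B completes the sync with A and considers A alive, so it syncs with A; since A is a seed, B no longer syncs with other seeds

    Node C comes up, finds no live nodes, likewise reaches step 3 and syncs with a random seed, say D this time

    Node C completes the sync with D and considers D alive, so it syncs with D; since D is also a seed, C no longer syncs with other seeds

    Two islands have now formed: A and B sync with each other and C and D sync with each other, but {A,B} and {C,D} never sync with each other, so neither pair learns that the other exists.

    With the second check added, after A and B have synced, each sees only one live node while there are four seeds, so it contacts a random seed again, which breaks up the island.

  3. MessagingService starts listening on the local address. This is the class used for message passing.
  4. StorageLoadBalancer starts broadcasting. This class implements the paper "Scalable range query processing for large-scale distributed database applications". It monitors load information and rebalances load when necessary; it runs every 5 minutes.
  5. Announce the node's own version to all seed nodes. It first announces actively via messages and finally announces passively via Gossip.
  6. Add application state to the Gossiper's local map.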
            MessagingService.instance().listen(FBUtilities.getLocalAddress());
            StorageLoadBalancer.instance.startBroadcasting();
            MigrationManager.announce(DatabaseDescriptor.getDefsVersion(), DatabaseDescriptor.getSeeds());
            Gossiper.instance.addLocalApplicationState(ApplicationState.RELEASE_VERSION, valueFactory.releaseVersion());
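  7. HintedHandOffManager registers its MBean.
  8. Check whether AutoBootstrap is enabled; the two cases are handled differently. Without AutoBootstrap, the node just sets its token and is done. With AutoBootstrap things get serious: the node first obtains load information through StorageLoadBalancer, then checks whether it is already part of the token ring, which is an error. If it gets past that, it uses the load information to decide its own token and then starts up with that token.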
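Coming back to the three gossip rules quoted in item 2 of this list, here is a sketch of how the targets of one round might be chosen. It is a simplification based only on the quoted rules, not the actual Gossiper code: nodes are plain strings, and GossipRoundDemo and its method are names invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class GossipRoundDemo {
    /** Picks the peers to contact in one gossip round, following the three quoted rules. */
    static List<String> targets(List<String> live, List<String> unreachable,
                                List<String> seeds, Random random) {
        List<String> chosen = new ArrayList<>();
        String livePeer = null;
        if (!live.isEmpty()) { // rule 1: one random live node
            livePeer = live.get(random.nextInt(live.size()));
            chosen.add(livePeer);
        }
        if (!unreachable.isEmpty()) // rule 2: one random unreachable node
            chosen.add(unreachable.get(random.nextInt(unreachable.size())));
        // rule 3: also contact a seed if rule 1 did not reach one,
        // or if fewer nodes are live than there are seeds (avoids seed islands)
        boolean reachedSeed = livePeer != null && seeds.contains(livePeer);
        if (!seeds.isEmpty() && (!reachedSeed || live.size() < seeds.size()))
            chosen.add(seeds.get(random.nextInt(seeds.size())));
        return chosen;
    }

    public static void main(String[] args) {
        List<String> seeds = Arrays.asList("A", "B", "C", "D");
        // A has synced only with B so far: one live node, four seeds.
        List<String> round = targets(Arrays.asList("B"), Collections.<String>emptyList(), seeds, new Random());
        System.out.println(round);
    }
}
```

In the island scenario, a node that sees one live peer and four seeds always contacts a second, randomly chosen seed, so sooner or later that seed lies outside its own island.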

17. Load mx4j

This makes JMX usable. The implementing code is full of reflection...

III. start

Start the RPC server. Ultimately this method is called:

    protected void startServer()
    {
        if (server == null)
        {
            server = new ThriftServer(listenAddr, listenPort);
            server.start();
        }
    }

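IV. Afterword

Thanks to procrastination, this post took more than two months to write, which is embarrassing... Of course, this was also my first time reading this code, and it often took a long while to understand a single method. Still, finishing this post was well worth it: I learned many internal implementation details and should be able to read faster from now on. The last few parts read a bit like a laundry list because I wrote them in a hurry. Some method details are also missing; I plan to cover them in separate posts, since this one is already far too long...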


IPv6 addresses for Google services

Copied here as a memo, so they are easy to find the next time I reinstall my system

#google plus
2404:6800:8005::71 profiles.google.com
2404:6800:8005::65 plusone.google.com
2404:6800:8005::8a plus.google.com
2404:6800:8005::62 talkgadget.google.com
#The following are mainland China IPs of the image servers, so that images still load once IPv6 is in use
203.208.46.29 picasaweb.google.com
203.208.46.29 lh1.ggpht.com
203.208.46.29 lh2.ggpht.com
203.208.46.29 lh3.ggpht.com
203.208.46.29 lh4.ggpht.com
203.208.46.29 lh5.ggpht.com
203.208.46.29 lh6.ggpht.com
203.208.46.29 lh6.googleusercontent.com
203.208.46.29 lh5.googleusercontent.com
203.208.46.29 lh4.googleusercontent.com
203.208.46.29 lh3.googleusercontent.com
203.208.46.29 lh2.googleusercontent.com
203.208.46.29 lh1.googleusercontent.com
203.208.46.29 plus.google.com
203.208.46.29 talkgadget.google.com

##Google.com Google.com
2404:6800:8005::68 www.google.com #Home page
2404:6800:8005::c1 m.google.com #Google Mobile
2404:6800:8005::54 accounts.google.com #Accounts
2404:6800:8005::65 services.google.com #Service applications
2404:6800:8005::65 goto.google.com #Redirect
2404:6800:8005::d2 jmt0.google.com
2404:6800:8005::d2 wire.l.google.com

##Google.com.hk Google Hong Kong
2404:6800:8005::2e www.google.com.hk
2404:6800:8005::2e images.google.com.hk
2404:6800:8005::2e video.google.com.hk
2404:6800:8005::2e maps.google.com.hk
2404:6800:8005::2e news.google.com.hk
2404:6800:8005::2e translate.google.com.hk
2404:6800:8005::2e blogsearch.google.com.hk
2404:6800:8005::2e picasaweb.google.com.hk
2404:6800:8005::2e toolbar.google.com.hk
2404:6800:8005::2e desktop.google.com.hk
2404:6800:8005::2e id.google.com.hk
2404:6800:8005::2e wenda.google.com.hk
2404:6800:8005::67 www.googlechinawebmaster.com

##Google.cn Google China (Google Music does not work properly with these addresses enabled)
2401:3800:c001::68 translate.google.cn #Translate
2401:3800:c001::68 blogsearch.google.cn #Blog search
2401:3800:c001::68 pack.google.cn #Google Pack (redirect)
2401:3800:c001::68 news.google.cn #News (redirect)
2401:3800:c001::68 video.google.cn #Video (redirect)
2404:6800:8005::84 music.googleusercontent.cn

##Google.com.tw Google Taiwan
2404:6800:8005::2f www.google.com.tw #Home page
2404:6800:8005::2f picasaweb.google.com.tw #picasaweb

##Google.co.jp Google Japan
2a00:1450:8006::30 www.google.co.jp

#IPv6:ipv6.google.co.jp
2404:6800:8005::20 www.google.com.tr #Turkey
2404:6800:8005::21 www.google.com.au #Australia
2404:6800:8005::22 www.google.com.vn #Vietnam
2404:6800:8005::23 www.google.com.pk #Pakistan
2404:6800:8005::24 www.google.com.my #Malaysia
2404:6800:8005::25 www.google.com.pe
2404:6800:8005::26 www.google.co.za
2404:6800:8005::27 www.google.co.ve
2404:6800:8005::28 www.google.com.ph
2404:6800:8005::29 www.google.com.ar
2404:6800:8005::2a www.google.co.nz
2404:6800:8005::2b www.google.lt
2404:6800:8005::2d www.google.com.sg #Singapore
2404:6800:8005::2e www.google.com.hk #Hong Kong
2404:6800:8005::2f www.google.com.tw #Taiwan
2404:6800:8005::30 www.google.co.jp #Japan
2404:6800:8005::31 www.google.ae
2404:6800:8005::32 www.google.co.uk #United Kingdom
2404:6800:8005::33 www.google.com.gr
2404:6800:8005::34 www.google.de
2404:6800:8005::35 www.google.co.il
2404:6800:8005::36 www.google.fr #France
2404:6800:8005::38 www.google.it #Italy
2404:6800:8005::39 www.google.lv
2404:6800:8005::3a www.google.ca
2404:6800:8005::3b www.google.pl
2404:6800:8005::3c www.google.ch
2404:6800:8005::3d www.google.ro
2404:6800:8005::3e www.google.nl #Netherlands
2404:6800:8005::3f www.google.com.ru #Russia
2404:6800:8005::40 www.google.at #Austria
2404:6800:8005::42 www.google.be
2404:6800:8005::44 www.google.co.kr #South Korea
2404:6800:8005::45 www.google.com.ua
2404:6800:8005::48 www.google.fi #Finland
2404:6800:8005::49 www.google.co.in
2404:6800:8005::4a www.google.pt
2404:6800:8005::4b www.google.com.ly
2404:6800:8005::4c www.google.com.br

#Web
2404:6800:8005::68 www.google.com #Home page
2404:6800:8005::68 encrypted.google.com #Home page
2404:6800:8005::68 www.l.google.com
2404:6800:8005::62 www0.l.google.com
2404:6800:8005::62 www1.l.google.com
2404:6800:8005::62 www3.l.google.com
2404:6800:8005::62 suggestqueries.google.com #Search suggestions
2404:6800:8005::62 suggestqueries.l.google.com #Search suggestions
2404:6800:8005::62 clients0.google.com #Client server
2404:6800:8005::62 clients1.google.com #Client server
2404:6800:8005::62 clients2.google.com #Client server
2404:6800:8005::62 clients3.google.com #Client server
2404:6800:8005::62 clients4.google.com #Client server
2404:6800:8005::62 clients.l.google.com
2404:6800:8005::62 clients1.google.com.hk # .com.hk search suggestions
2404:6800:8005::62 clients-china.l.google.com
2404:6800:8005::62 linkhelp.clients.google.com #

#Images
2404:6800:8005::68 images.google.com #Home page
2404:6800:8005::68 images.l.google.com #
2404:6800:8005::62 tbn0.google.com
2404:6800:8005::62 tbn1.google.com
2404:6800:8005::62 tbn2.google.com
2404:6800:8005::62 tbn3.google.com
2404:6800:8005::62 tbn4.google.com
2404:6800:8005::62 tbn5.google.com
2404:6800:8005::62 tbn6.google.com

#Video
2404:6800:8005::62 video.google.com #Home page
2404:6800:8005::62 0.gvt0.com
2404:6800:8005::62 1.gvt0.com
2404:6800:8005::62 2.gvt0.com
2404:6800:8005::62 3.gvt0.com
2404:6800:8005::62 4.gvt0.com
2404:6800:8005::62 5.gvt0.com
2404:6800:8005::62 video-stats.video.google.com
2404:6800:8005::74 upload.video.google.com
2404:6800:8005::74 sslvideo-upload.l.google.com
2404:6800:8005::62 vp.video.google.com
2404:6800:8005::62 vp.video.l.google.com
2404:6800:8005::62 qwqy.vp.video.l.google.com
2404:6800:8005::62 nz.vp.video.l.google.com
2404:6800:8005::62 nztdug.vp.video.l.google.com
2404:6800:8005::62 pr.vp.video.l.google.com
2404:6800:8005::62 ug.vp.video.l.google.com
2404:6800:8005::62 vp01.video.l.google.com
2404:6800:8005::62 vp02.video.l.google.com
2404:6800:8005::62 vp03.video.l.google.com
2404:6800:8005::62 vp04.video.l.google.com
2404:6800:8005::62 vp05.video.l.google.com
2404:6800:8005::62 vp06.video.l.google.com
2404:6800:8005::62 vp07.video.l.google.com
2404:6800:8005::62 vp08.video.l.google.com
2404:6800:8005::62 vp09.video.l.google.com
2404:6800:8005::62 vp10.video.l.google.com
2404:6800:8005::62 vp11.video.l.google.com
2404:6800:8005::62 vp12.video.l.google.com
2404:6800:8005::62 vp13.video.l.google.com
2404:6800:8005::62 vp14.video.l.google.com
2404:6800:8005::62 vp15.video.l.google.com
2404:6800:8005::62 vp16.video.l.google.com
2404:6800:8005::62 vp17.video.l.google.com
2404:6800:8005::62 vp18.video.l.google.com
2404:6800:8005::62 vp19.video.l.google.com
2404:6800:8005::62 vp20.video.l.google.com

2401:3800:c001::68 0.gvt0.cn
2401:3800:c001::68 1.gvt0.cn
2401:3800:c001::68 2.gvt0.cn
2401:3800:c001::68 3.gvt0.cn

#Map
2404:6800:8005::68 maps.google.com #Home page
2404:6800:8005::68 local.google.com
2404:6800:8005::68 ditu.google.com #China version (mirror)
2404:6800:8005::68 maps.l.google.com
2404:6800:8005::62 maps-api-ssl.google.com
2404:6800:8005::62 map.google.com
2404:6800:8005::62 kh.google.com
2404:6800:8005::62 kh.l.google.com
2404:6800:8005::62 khmdb.google.com
2404:6800:8005::62 khm.google.com #
2404:6800:8005::62 khm.l.google.com
2404:6800:8005::62 khm0.google.com #Satellite View
2404:6800:8005::62 khm1.google.com #Satellite View
2404:6800:8005::62 khm2.google.com #Satellite View
2404:6800:8005::62 khm3.google.com #Satellite View
2404:6800:8005::62 cbk0.google.com #Street View
2404:6800:8005::62 cbk1.google.com #Street View
2404:6800:8005::62 cbk2.google.com #Street View
2404:6800:8005::62 cbk3.google.com #Street View
2404:6800:8005::62 mw0.google.com
2404:6800:8005::62 mw1.google.com
2404:6800:8005::62 mw2.google.com
2404:6800:8005::62 mw3.google.com
2404:6800:8005::62 mw-small.l.google.com
2404:6800:8005::62 mt.l.google.com
2404:6800:8005::62 mt0.google.com
2404:6800:8005::62 mt1.google.com
2404:6800:8005::62 mt2.google.com
2404:6800:8005::62 mt3.google.com
2404:6800:8005::62 mlt0.google.com
2404:6800:8005::62 mlt1.google.com
2404:6800:8005::62 mlt2.google.com
2404:6800:8005::62 mlt3.google.com
2404:6800:8005::62 gg.google.com
2404:6800:8005::62 csi.l.google.com
2404:6800:8005::62 id.google.com
2404:6800:8005::62 id.l.google.com
2401:3800:c001::68 id.google.cn
2401:3800:c001::68 ditu.google.cn
2401:3800:c001::68 mt0.google.cn
2401:3800:c001::68 mt1.google.cn
2401:3800:c001::68 mt2.google.cn
2401:3800:c001::68 mt3.google.cn
2401:3800:c001::68 maps.gstatic.cn

#News
2404:6800:8005::68 news.google.com #Home page
2404:6800:8005::68 news.l.google.com
2404:6800:8005::62 nt0.ggpht.com
2404:6800:8005::62 nt1.ggpht.com
2404:6800:8005::62 nt2.ggpht.com
2404:6800:8005::62 nt3.ggpht.com
2404:6800:8005::62 nt4.ggpht.com
2404:6800:8005::62 nt5.ggpht.com

#Gmail
2404:6800:8005::11 mail.google.com #Home page
2404:6800:8005::53 googlemail.l.google.com
2404:6800:8005::11 googlemail.l.google.com
2404:6800:8005::12 googlemail.l.google.com
2404:6800:8005::13 googlemail.l.google.com
2404:6800:8005::bd chatenabled.mail.google.com #Gtalk chat service inside Gmail
2404:6800:8005::62 talk.gmail.com #Gtalk chat service inside Gmail
2404:6800:8005::62 gmail.google.com #
2404:6800:8005::62 gmail.l.google.com #
2404:6800:8005::62 www.gmail.com #Gmail home page
2404:6800:8005::62 gmail.com #Gmail home page
2404:6800:8005::62 pop.gmail.com #POP service
2404:6800:8005::62 smtp.gmail.com #SMTP service
2404:6800:8005::62 smtp1.google.com
2404:6800:8005::62 smtp2.google.com
2404:6800:8005::62 smtp3.google.com
2404:6800:8005::62 smtp4.google.com
2404:6800:8005::62 smtp5.google.com
2404:6800:8005::62 smtp-out.google.com
2404:6800:8005::62 smtp-out2.google.com
2404:6800:8005::62 smtp-out3.google.com
2404:6800:8005::62 imap.google.com #
2404:6800:8005::62 gmail-pop.l.google.com
2404:6800:8005::62 gmail-smtp.l.google.com
2404:6800:8005::62 gmail-smtp-in.l.google.com
2404:6800:8005::62 gmr-smtp-in.l.google.com

#Books
2404:6800:8005::62 books.google.com #Home page
2404:6800:8005::62 bks0.books.google.com
2404:6800:8005::62 bks1.books.google.com
2404:6800:8005::62 bks2.books.google.com
2404:6800:8005::62 bks3.books.google.com
2404:6800:8005::62 bks4.books.google.com
2404:6800:8005::62 bks5.books.google.com
2404:6800:8005::62 bks6.books.google.com
2404:6800:8005::62 bks7.books.google.com
2404:6800:8005::62 bks8.books.google.com
2404:6800:8005::62 bks9.books.google.com

#Finance
2404:6800:8005::62 finance.google.com

#Translate
2404:6800:8005::62 translate.google.com
2401:3800:c001::68 translate.google.cn

#Trends
2404:6800:8005::63 trends.google.com

#Directory
2404:6800:8005::8a directory.google.com
2404:6800:8005::8a dir.google.com #Google web directory

#Blog search
2404:6800:8005::63 blogsearch.google.com
2401:3800:c001::68 blogsearch.google.cn

#Calendar
2404:6800:8005::64 calendar.google.com

#Photo/Picasa
2404:6800:8005::5d photos.google.com
2404:6800:8005::63 picasa.google.com
2404:6800:8005::be picasaweb.google.com
2404:6800:8005::62 lh0.ggpht.com
2404:6800:8005::62 lh1.ggpht.com
2404:6800:8005::62 lh2.ggpht.com
2404:6800:8005::62 lh3.ggpht.com
2404:6800:8005::62 lh4.ggpht.com
2404:6800:8005::62 lh5.ggpht.com
2404:6800:8005::62 lh6.ggpht.com
2404:6800:8005::62 lh7.ggpht.com
2404:6800:8005::62 lh8.ggpht.com
2404:6800:8005::62 lh9.ggpht.com
2404:6800:8005::62 lh6.google.com

#Docs
2404:6800:8005::64 docs.google.com
2404:6800:8005::65 docs0.google.com
2404:6800:8005::65 docs1.google.com
2404:6800:8005::65 docs2.google.com
2404:6800:8005::65 docs3.google.com
2404:6800:8005::65 docs4.google.com
2404:6800:8005::65 docs5.google.com
2404:6800:8005::65 docs6.google.com
2404:6800:8005::65 docs7.google.com
2404:6800:8005::65 docs8.google.com
2404:6800:8005::65 docs9.google.com
2404:6800:8005::62 spreadsheet.google.com
2404:6800:8005::62 spreadsheets.google.com
2404:6800:8005::62 spreadsheets0.google.com
2404:6800:8005::62 spreadsheets1.google.com
2404:6800:8005::62 spreadsheets2.google.com
2404:6800:8005::62 spreadsheets3.google.com
2404:6800:8005::62 spreadsheets4.google.com
2404:6800:8005::62 spreadsheets5.google.com
2404:6800:8005::62 spreadsheets6.google.com
2404:6800:8005::62 spreadsheets7.google.com
2404:6800:8005::62 spreadsheets8.google.com
2404:6800:8005::62 spreadsheets9.google.com
2404:6800:8005::62 spreadsheets.l.google.com
2404:6800:8005::62 spreadsheets-china.l.google.com
2404:6800:8005::62 writely.google.com
2404:6800:8005::62 writely.l.google.com
2404:6800:8005::62 writely-com.l.google.com
2404:6800:8005::62 writely-china.l.google.com

#Reader
2404:6800:8005::68 reader.google.com
2404:6800:8005::68 www2.l.google.com

#Group (discussion forums)
2404:6800:8005::62 groups.google.com
2404:6800:8005::62 groups.l.google.com
2404:6800:8005::89 *.googlegroups.com
2404:6800:8005::89 blob-s-docs.googlegroups.com
2404:6800:8005::89 2503061233288453901-a-1802744773732722657-s-sites.googlegroups.com

#Scholar (academic search)
2404:6800:8005::62 scholar.google.com
2404:6800:8005::62 scholar.l.google.com

#Tools
2404:6800:8005::62 tools.google.com
2404:6800:8005::62 tools.l.google.com

#Code
2404:6800:8005::64 code.google.com #home page
2404:6800:8005::64 code.l.google.com #
2404:6800:8005::52 *.googlecode.com #
2404:6800:8005::52 chromium.googlecode.com #
2404:6800:8005::52 searchforchrome.googlecode.com #
2404:6800:8005::52 android-scripting.googlecode.com #Android Scripting Environment
2404:6800:8005::52 earth-api-samples.googlecode.com #
2404:6800:8005::52 gmaps-samples-flash.googlecode.com #
2404:6800:8005::52 google-code-feed-gadget.googlecode.com
2404:6800:8005::52 china-addthis.googlecode.com #
2404:6800:8005::52 get-flash-videos.googlecode.com #get-flash-videos
2404:6800:8005::52 youplayer.googlecode.com #YouPlayer
2404:6800:8005::52 cclive.googlecode.com #ccLive

#Labs
2404:6800:8005::65 labs.google.com
2404:6800:8005::62 www.googlelabs.com
2404:6800:8005::62 browsersize.googlelabs.com #Browser Size
2404:6800:8005::62 citytours.googlelabs.com #City Tours
2404:6800:8005::62 fastflip.googlelabs.com #Fast Flip
2404:6800:8005::62 followfinder.googlelabs.com #Follow Finder
2404:6800:8005::62 image-swirl.googlelabs.com #Image Swirl
2404:6800:8005::62 listen.googlelabs.com #Google Listen
2404:6800:8005::62 livingstories.googlelabs.com #Living Stories
2404:6800:8005::62 newstimeline.googlelabs.com #Google News Timeline
2404:6800:8005::62 relatedlinks.googlelabs.com #Related Links
2404:6800:8005::62 scriptconv.googlelabs.com #Script Converter
2404:6800:8005::62 similar-images.googlelabs.com #Similar Images
2404:6800:8005::62 storegadget.googlelabs.com #Google Checkout Store Gadget
2404:6800:8005::62 tables.googlelabs.com #Fusion Tables
2404:6800:8005::62 appspot.l.google.com

#Knol (online encyclopedia)
2404:6800:8005::65 knol.google.com

#SketchUp (3D modeling tool)
2404:6800:8005::62 sketchup.google.com

#Pack (software bundle)
2404:6800:8005::68 pack.google.com
2404:6800:8005::68 cache.pack.google.com
2401:3800:c001::68 pack.google.cn

#Blogger (blogging service)
2404:6800:8005::bf www.blogger.com
2404:6800:8005::bf blogger.com
2404:6800:8005::bf buttons.blogger.com
2404:6800:8005::bf beta.blogger.com
2404:6800:8005::bf draft.blogger.com #Blogger test area
2404:6800:8005::bf status.blogger.com #Blogger status
2404:6800:8005::bf help.blogger.com #help center
2404:6800:8005::bf buzz.blogger.com #Blogger Buzz blog (English)
2404:6800:8005::bf photos1.blogger.com
2404:6800:8005::bf bp0.blogger.com
2404:6800:8005::bf bp1.blogger.com
2404:6800:8005::bf bp2.blogger.com
2404:6800:8005::bf bloggerphotos.l.google.com
2404:6800:8005::62 blogger.google.com
2404:6800:8005::62 www2.blogger.com
2404:6800:8005::62 blogger.l.google.com
2404:6800:8005::62 www.blogblog.com
2404:6800:8005::62 www1.blogblog.com
2404:6800:8005::62 www2.blogblog.com
2404:6800:8005::62 img.blogblog.com
2404:6800:8005::62 img1.blogblog.com
2404:6800:8005::62 img2.blogblog.com
2404:6800:8005::62 img.blshe.com

#Blogspot (blogging service)
2404:6800:8005::62 www.blogspot.com #home page
2404:6800:8005::62 blogsofnote.blogspot.com #Blogs of Note (English version)
2404:6800:8005::62 knownissues.blogspot.com #known issues
2404:6800:8005::62 1.bp.blogspot.com #
2404:6800:8005::62 2.bp.blogspot.com #
2404:6800:8005::62 3.bp.blogspot.com #
2404:6800:8005::62 4.bp.blogspot.com #
2404:6800:8005::62 bloggertemplatespreview.blogspot.com #live preview for the template editor

#Official Google blogs
2404:6800:8005::62 adwordsapi.blogspot.com
2404:6800:8005::62 adsense-zhs.blogspot.com
2404:6800:8005::62 android-developers.blogspot.com
2404:6800:8005::62 apacdeveloper.blogspot.com #Google Asia Pacific Developer Blog
2404:6800:8005::62 booksearch.blogspot.com #Inside Google Books
2404:6800:8005::62 chrome.blogspot.com
2404:6800:8005::62 doubleclickpublishersapi.blogspot.com
2404:6800:8005::62 emeadev.blogspot.com #Google Europe, Middle East & Africa Developer Blog
2404:6800:8005::62 gearsblog.blogspot.com
2404:6800:8005::62 google-code-featured.blogspot.com #Featured Projects on Google Code
2404:6800:8005::62 google-entertainment-it.blogspot.com
2404:6800:8005::62 google-opensource.blogspot.com
2404:6800:8005::62 googleajaxsearchapi.blogspot.com
2404:6800:8005::62 googleappengine.blogspot.com
2404:6800:8005::62 googleappsdeveloper.blogspot.com
2404:6800:8005::62 googleblog.blogspot.com #Official Google Blog
2404:6800:8005::62 googlecheckout.blogspot.com
2404:6800:8005::62 googlecheckoutapi.blogspot.com
2404:6800:8005::62 googlechinablog.blogspot.com
2404:6800:8005::62 googlechromereleases.blogspot.com #Google Chrome Releases
2404:6800:8005::62 googlecode.blogspot.com
2404:6800:8005::62 googlecustomsearch.blogspot.com #Google Custom Search Blog
2404:6800:8005::62 googleenterprise.blogspot.com
2404:6800:8005::62 googlegeodevelopers.blogspot.com #Google Geo Developers Blog
2404:6800:8005::62 googlemashupeditor.blogspot.com
2404:6800:8005::62 googlemobile.blogspot.com
2404:6800:8005::62 googleresearch.blogspot.com
2404:6800:8005::62 googletalk.blogspot.com
2404:6800:8005::62 googlewebmaster-cn.blogspot.com
2404:6800:8005::62 googlewebmastercentral.blogspot.com
2404:6800:8005::62 googlewebtoolkit.blogspot.com
2404:6800:8005::62 golangblog.blogspot.com
2404:6800:8005::62 gmailblog.blogspot.com
2404:6800:8005::62 igoogledeveloper.blogspot.com #iGoogle Developer Blog
2404:6800:8005::62 webmproject.blogspot.com
2404:6800:8005::62 youtube-global.blogspot.com #YouTube Blog
#Other popular blogs on BlogSpot
2404:6800:8005::62 googlesystem.blogspot.com #Google Operating System
2404:6800:8005::62 chinafreenet.blogspot.com #中国自由网
2404:6800:8005::62 gregmankiw.blogspot.com #GREG MANKIW’S BLOG
2404:6800:8005::62 xiangeliushui.blogspot.com #年华似水,岁月如歌
2404:6800:8005::62 chinagfw.blogspot.com #GFW Blog
2404:6800:8005::62 wallpapers-arena.blogspot.com #Wallpapers Arena
2404:6800:8005::62 ggq.blogspot.com #GG圈
2404:6800:8005::62 whiteappleer.blogspot.com #WA+ER
2404:6800:8005::62 rain-reader.blogspot.com #Nostalgia: Those Who Remain
2404:6800:8005::62 unityteam1.blogspot.com #生活圈 BLOG
2404:6800:8005::62 ipv6-or-no-ipv6.blogspot.com #IPv6 Related Stuff
2404:6800:8005::62 gysj.blogspot.com #
2404:6800:8005::62 szncu.blogspot.com #
#2404:6800:8005::62 *.blogspot.com #add your own blog addresses here

#Checkout (for buyers)
2404:6800:8005::73 checkout.google.com

#Orkut social network (not yet deployed on IPv6)
2404:6800:8005::69 help.orkut.com
2404:6800:8005::62 officialorkutblog.blogspot.com
2404:6800:8003::79 blog.orkut.com
2404:6800:8003::79 en.blog.orkut.com

#Sites (collaboration platform)
2404:6800:8005::65 sites.google.com
2404:6800:8005::62 gsamplemaps.googlepages.com

#Google Apps (enterprise application suite)
2404:6800:8005::62 apps.google.com #home page
2404:6800:8003::79 ghs.google.com
2404:6800:8003::79 ghs46.google.com #GHS dual-stack entry point
2404:6800:8003::79 ghs.l.google.com
2404:6800:8003::79 ghs46.l.google.com
2404:6800:8003::79 blog.opensocial.org
2404:6800:8003::79 govecn.org
2404:6800:8003::79 www.govecn.org
2404:6800:8003::79 1984talk.org
2404:6800:8003::79 www.1984talk.org
#2404:6800:8003::79 ghs.google.com #add the blog or Google Sites addresses of your Google Apps domain here

#Mashups/App Engine GAE
2404:6800:8005::67 googlemashups.com #Google Mashup Editor
2404:6800:8005::68 www.googlemashups.com
2404:6800:8005::62 googlemashups.l.google.com
2404:6800:8005::63 *.googlemashups.com
2404:6800:8005::64 appengine.google.com #home page
2404:6800:8005::62 appspot.l.google.com #
2404:6800:8005::8d *.appspot.com
2404:6800:8005::8d productideas.appspot.com #Google Product Ideas (Google 汇问)
2404:6800:8005::8d wave-api.appspot.com #Google Wave API
2404:6800:8005::8d wave-skynet.appspot.com #SkyNet
2404:6800:8005::8d cactus-wave.appspot.com #
2404:6800:8005::8d storegadgetwizard.appspot.com #Google Checkout Store Gadget
2404:6800:8005::8d moderator.appspot.com #Google Moderator
2404:6800:8005::8d haiticrisis.appspot.com #Google Person Finder: Haiti Earthquake
2404:6800:8005::8d mytracks.appspot.com #My Tracks for Android
2404:6800:8005::8d reader2twitter.appspot.com #Reader2Tweet
2404:6800:8005::8d twitese.appspot.com
2404:6800:8005::8d gfw.appspot.com
2404:6800:8005::8d go2china9.appspot.com
2404:6800:8005::8d mirrorrr.appspot.com
2404:6800:8005::8d mirrornt.appspot.com
2404:6800:8005::8d soproxy.appspot.com
2404:6800:8005::8d so-proxy.appspot.com
2404:6800:8005::8d go-west.appspot.com
2404:6800:8005::8d proxytea.appspot.com
2404:6800:8005::8d sivanproxy.appspot.com
2404:6800:8005::8d proxybay.appspot.com
2404:6800:8005::8d ipgoto.appspot.com
2404:6800:8005::8d meme2028.appspot.com
2404:6800:8005::8d autoproxy2pac.appspot.com

#Google APIs (developer API services)
2404:6800:8005::62 chart.apis.google.com #Google Chart API
2404:6800:8005::5f *.googleapis.com
2404:6800:8005::5f translate.googleapis.com #Google Translate API
2404:6800:8005::5f ajax.googleapis.com #Ajax API
2404:6800:8005::5f googleapis-ajax.google.com
2404:6800:8005::5f googleapis-ajax.l.google.com
2404:6800:8005::5f commondatastorage.googleapis.com #

#Google Hosted (hosting service)
2404:6800:8005::84 www.googlehosted.com
2404:6800:8005::84 music.googleusercontent.com #music player, album covers, etc.
2404:6800:8005::84 googlehosted.l.google.com
2404:6800:8005::62 base.googlehosted.com
2404:6800:8005::62 base0.googlehosted.com
2404:6800:8005::62 base1.googlehosted.com
2404:6800:8005::62 base2.googlehosted.com
2404:6800:8005::62 base3.googlehosted.com
2404:6800:8005::62 base4.googlehosted.com
2404:6800:8005::62 base5.googlehosted.com

#GoogleUserContent (user-defined content for Google services)
2404:6800:8005::84 www.googleusercontent.com #
2404:6800:8005::84 clients1.googleusercontent.com #
2404:6800:8005::84 clients2.googleusercontent.com #
2404:6800:8005::84 webcache.googleusercontent.com #cached page snapshots
2404:6800:8005::84 lh0.googleusercontent.com #
2404:6800:8005::84 lh1.googleusercontent.com #
2404:6800:8005::84 lh2.googleusercontent.com #
2404:6800:8005::84 lh3.googleusercontent.com #
2404:6800:8005::62 lh3.googleusercontent.com #Google Music (mainland China)
2404:6800:8005::62 lh4.googleusercontent.com #Google Music (mainland China)
2404:6800:8005::62 lh5.googleusercontent.com #Google Music (mainland China)
2404:6800:8005::62 lh6.googleusercontent.com #Google Music (mainland China)
2404:6800:8005::84 s2.googleusercontent.com #
2404:6800:8005::84 wave.googleusercontent.com #Wave
2404:6800:8005::84 blogger.googleusercontent.com #Blogger
2404:6800:8005::84 translate.googleusercontent.com #Translate
2404:6800:8005::84 music-onebox.googleusercontent.com #CD cover images for music tracks
2404:6800:8005::84 spreadsheets-opensocial.googleusercontent.com #Spreadsheets
2404:6800:8005::84 www-opensocial.googleusercontent.com #
2404:6800:8005::84 www-gm-opensocial.googleusercontent.com #Gmail?
2404:6800:8005::84 www-opensocial-sandbox.googleusercontent.com #SandBox
2404:6800:8005::84 www-open-opensocial.googleusercontent.com #
2404:6800:8005::84 1-open-opensocial.googleusercontent.com #
2404:6800:8005::84 www-focus-opensocial.googleusercontent.com #thumbnails
2404:6800:8005::84 images0-focus-opensocial.googleusercontent.com #thumbnails
2404:6800:8005::84 images1-focus-opensocial.googleusercontent.com #thumbnails
2404:6800:8005::84 images2-focus-opensocial.googleusercontent.com #thumbnails
2404:6800:8005::84 doc-00-7o-docs.googleusercontent.com #
2404:6800:8005::84 doc-08-7o-docs.googleusercontent.com #
2404:6800:8005::84 doc-10-7o-docs.googleusercontent.com #
2404:6800:8005::84 doc-14-7o-docs.googleusercontent.com #
2404:6800:8005::84 doc-0c-7o-docs.googleusercontent.com #
2404:6800:8005::84 doc-0g-7o-docs.googleusercontent.com #
2404:6800:8005::84 doc-0s-7o-docs.googleusercontent.com #
2404:6800:8005::84 www-focus-opensocial.googleusercontent.com #
2404:6800:8005::84 0-focus-opensocial.googleusercontent.com #
2404:6800:8005::84 1-focus-opensocial.googleusercontent.com #
2404:6800:8005::84 2-focus-opensocial.googleusercontent.com #
2404:6800:8005::84 3-focus-opensocial.googleusercontent.com #
2404:6800:8005::84 www-open-opensocial.googleusercontent.com #
2404:6800:8005::84 0-open-opensocial.googleusercontent.com #
2404:6800:8005::84 1-open-opensocial.googleusercontent.com #
2404:6800:8005::84 2-open-opensocial.googleusercontent.com #
2404:6800:8005::84 3-open-opensocial.googleusercontent.com #
2404:6800:8005::84 www-wave-opensocial.googleusercontent.com #Wave
2404:6800:8005::84 0-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 1-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 2-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 3-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 4-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 5-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 6-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 7-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 8-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 9-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 10-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 11-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 12-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 13-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 14-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 15-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 16-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 17-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 18-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 19-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 20-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 21-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 22-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 23-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 24-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 25-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 26-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 27-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 28-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 29-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 30-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 31-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 32-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 33-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 34-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 35-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 36-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 37-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 38-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 39-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 40-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 1927502848-wave-opensocial.googleusercontent.com #
2404:6800:8005::84 la5dhjn62ripv179lf7outfl68h6dc3c-a-wave-opensocial.googleusercontent.com
2404:6800:8005::84 3hdrrlnlknhi77nrmsjnjr152ueo3soc-a-calendar-opensocial.googleusercontent.com
2404:6800:8005::84 eds9earadhd329tuipi6kfc947ts928j-a-sites-opensocial.googleusercontent.com
2404:6800:8005::84 sp5ovcebgtpf6rg65f53gdnvqtt3a58n-a-sites-opensocial.googleusercontent.com

#Chrome (Google browser)
2404:6800:8005::64 chrome.google.com
2404:6800:8005::65 browsersync.google.com
2404:6800:8005::65 browsersync.l.google.com
2404:6800:8005::63 toolbarqueries.google.com #PageRank lookup (shown in the toolbar)
2404:6800:8005::63 toolbarqueries.clients.google.com
2404:6800:8005::63 toolbarqueries.l.google.com

#Chromium (the Chromium project)
2404:6800:8005::65 chromium.org #redirects to www
2404:6800:8003::79 www.chromium.org
2404:6800:8003::79 dev.chromium.org
2404:6800:8003::79 blog.chromium.org
2404:6800:8005::62 build.chromium.org

#Chromium OS (the Chromium operating system)
2404:6800:8003::79 goto.ext.google.com
2404:6800:8005::8d welcome-cros.appspot.com #Chromium main menu

#SafeBrowsing
2404:6800:8005::62 sb.google.com #Safe Browsing lookup API
2001:4860:8010::88 sb.l.google.com
2404:6800:8005::62 sb-ssl.google.com
2001:4860:8010::be sb-ssl.google.com
2001:4860:8010::5b sb-ssl.google.com
2001:4860:8010::5d sb-ssl.google.com
2001:4860:8010::88 sb-ssl.google.com
2404:6800:8005::62 sb-ssl.l.google.com
2404:6800:8005::62 safebrowsing.clients.google.com #Safe Browsing warning page
2404:6800:8005::62 safebrowsing-cache.google.com #Safe Browsing warning data (loaded in chunks)
2001:4860:4001:402::17 safebrowsing.cache.l.google.com

#Toolbar
2404:6800:8005::62 toolbar.google.com

#Desktop
2404:6800:8005::62 desktop.google.com
2404:6800:8005::62 desktop.l.google.com

#Google Earth
2404:6800:8005::65 earth.google.com

#Google Mars (map of Mars)
2404:6800:8005::65 mars.google.com

#Download
2404:6800:8005::5b dl.google.com
2404:6800:8005::5d dl.l.google.com
2404:6800:8005::88 dl-ssl.google.com

#Sandbox
2404:6800:8005::51 sandbox.google.com

#Wave
2404:6800:8005::76 wave.google.com
2404:6800:8005::76 www4.l.google.com
2404:6800:8005::76 wave0.google.com
2404:6800:8005::76 wave1.google.com
2404:6800:8005::62 googlewave.com

#WiFi
2404:6800:8005::7b wifi.google.com
2404:6800:8005::62 wifi.l.google.com

#GTalk (chat)
2404:6800:8005::62 talk.l.google.com
2404:6800:8005::62 default.talk.google.com
2404:6800:8005::62 talkgadget.google.com
2404:6800:8005::62 rtmp0.google.com
2404:6800:8005::62 users.talk.google.com

#Buzz
2404:6800:8005::62 buzz.google.com

#Answers/Guru/WenDa Q&A communities (the international version has been discontinued)
2404:6800:8005::66 answers.google.com
2404:6800:8005::62 guru.google.com
2404:6800:8005::62 guru.google.co.th #Thailand
2404:6800:8005::2e wenda.google.com.hk

#Fusion (RSS aggregation guide)
2404:6800:8005::62 fusion.google.com

#iGoogle Modules (iGoogle gadgets)
2404:6800:8005::62 gmodules.com
2404:6800:8005::62 www.gmodules.com
2404:6800:8005::62 www.ig.gmodules.com
2404:6800:8005::62 ig.gmodules.com
2404:6800:8005::62 ads.gmodules.com
2404:6800:8005::62 p.gmodules.com
2404:6800:8005::62 1.ig.gmodules.com
2404:6800:8005::62 2.ig.gmodules.com
2404:6800:8005::62 3.ig.gmodules.com
2404:6800:8005::62 4.ig.gmodules.com
2404:6800:8005::62 5.ig.gmodules.com
2404:6800:8005::62 6.ig.gmodules.com
2404:6800:8005::62 maps.gmodules.com
2404:6800:8005::62 img0.gmodules.com
2404:6800:8005::62 img1.gmodules.com
2404:6800:8005::62 img2.gmodules.com
2404:6800:8005::62 img3.gmodules.com
2404:6800:8005::62 skins.gmodules.com
2404:6800:8005::62 friendconnect.gmodules.com
2404:6800:8005::62 mc8tdi0ripmbpds25eboaupdulritrp6.friendconnect.gmodules.com
2404:6800:8005::62 r1rk9np7bpcsfoeekl0khkd2juj27q3o.friendconnect.gmodules.com
2404:6800:8005::62 r1rk9np7bpcsfoeekl0khkd2juj27q3o.a.friendconnect.gmodules.com

#GStatic (Google static file storage)
2404:6800:8005::62 www.gstatic.com
2404:6800:8005::62 csi.gstatic.com
2404:6800:8005::62 maps.gstatic.com
2404:6800:8005::78 ssl.gstatic.com
2404:6800:8005::62 t0.gstatic.com
2404:6800:8005::62 t1.gstatic.com
2404:6800:8005::62 t2.gstatic.com
2404:6800:8005::62 t3.gstatic.com
2404:6800:8005::62 t4.gstatic.com
2404:6800:8005::62 mt0.gstatic.com

##Other Google services
#YouTube
2404:6800:8005::65 www.youtube.com
2404:6800:8005::65 www.youtube-nocookie.com
2404:6800:8005::64 youtube-ui-china.l.google.com
2404:6800:8005::64 m.youtube.com
2404:6800:8005::64 tw.youtube.com
2404:6800:8005::65 youtu.be
2404:6800:8005::64 gdata.youtube.com
2404:6800:8005::64 help.youtube.com
2404:6800:8005::64 upload.youtube.com
2404:6800:8005::64 insight.youtube.com
2404:6800:8005::64 img.youtube.com
2404:6800:8005::64 s2.youtube.com
2404:6800:8005::64 youtube.com
2404:6800:8003::79 apiblog.youtube.com #YouTube API developer blog
2404:6800:8005::64 clients1.youtube.com
2001:4860:4001:402::15 static.cache.l.google.com
2404:6800:8005::76 ytimg.l.google.com
2404:6800:8005::76 i.ytimg.com
2404:6800:8005::76 i1.ytimg.com
2404:6800:8005::76 i2.ytimg.com
2404:6800:8005::76 i3.ytimg.com
2404:6800:8005::76 i4.ytimg.com
2404:6800:8005::76 d.yimg.com
2404:6800:8005::76 s.ytimg.com

2404:6800:4001::10 v1.lscache1.c.youtube.com
2404:6800:4001::10 v1.lscache2.c.youtube.com
2404:6800:4001::10 v1.lscache3.c.youtube.com
2404:6800:4001::10 v1.lscache4.c.youtube.com
2404:6800:4001:1::10 v1.lscache5.c.youtube.com
2404:6800:4001::10 v1.lscache6.c.youtube.com
2404:6800:4001::10 v1.lscache7.c.youtube.com
2404:6800:4001::10 v1.lscache8.c.youtube.com
2404:6800:4001:1::13 v2.lscache1.c.youtube.com
2404:6800:4001:1::13 v2.lscache2.c.youtube.com
2404:6800:4001::13 v2.lscache3.c.youtube.com
2404:6800:4001:1::13 v2.lscache4.c.youtube.com
2404:6800:4001::13 v2.lscache5.c.youtube.com
2404:6800:4001::13 v2.lscache6.c.youtube.com
2404:6800:4001::13 v2.lscache7.c.youtube.com
2404:6800:4001::13 v2.lscache8.c.youtube.com
2404:6800:4001::16 v3.lscache1.c.youtube.com
2404:6800:4001::16 v3.lscache2.c.youtube.com
2404:6800:4001::16 v3.lscache3.c.youtube.com
2404:6800:4001::16 v3.lscache4.c.youtube.com
2404:6800:4001::16 v3.lscache5.c.youtube.com
2404:6800:4001::16 v3.lscache6.c.youtube.com
2404:6800:4001:1::16 v3.lscache7.c.youtube.com
2404:6800:4001::16 v3.lscache8.c.youtube.com
2404:6800:4001::19 v4.lscache1.c.youtube.com
2404:6800:4001:1::19 v4.lscache2.c.youtube.com
2404:6800:4001::19 v4.lscache3.c.youtube.com
2404:6800:4001::19 v4.lscache4.c.youtube.com
2404:6800:4001::19 v4.lscache5.c.youtube.com
2404:6800:4001::19 v4.lscache6.c.youtube.com
2404:6800:4001::19 v4.lscache7.c.youtube.com
2404:6800:4001::19 v4.lscache8.c.youtube.com
2404:6800:4001::1c v5.lscache1.c.youtube.com
2404:6800:4001:1::1c v5.lscache2.c.youtube.com
2404:6800:4001::1c v5.lscache3.c.youtube.com
2404:6800:4001:1::1c v5.lscache4.c.youtube.com
2404:6800:4001::1c v5.lscache5.c.youtube.com
2404:6800:4001::1c v5.lscache6.c.youtube.com
2404:6800:4001::1c v5.lscache7.c.youtube.com
2404:6800:4001::1c v5.lscache8.c.youtube.com
2404:6800:4001::1f v6.lscache1.c.youtube.com
2404:6800:4001::1f v6.lscache2.c.youtube.com
2404:6800:4001::1f v6.lscache3.c.youtube.com
2404:6800:4001::1f v6.lscache4.c.youtube.com
2404:6800:4001::1f v6.lscache5.c.youtube.com
2404:6800:4001::1f v6.lscache6.c.youtube.com
2404:6800:4001::1f v6.lscache7.c.youtube.com
2404:6800:4001:1::1f v6.lscache8.c.youtube.com
2404:6800:4001::22 v7.lscache1.c.youtube.com
2404:6800:4001::22 v7.lscache2.c.youtube.com
2404:6800:4001::22 v7.lscache3.c.youtube.com
2404:6800:4001:1::22 v7.lscache4.c.youtube.com
2404:6800:4001::22 v7.lscache5.c.youtube.com
2404:6800:4001:1::22 v7.lscache6.c.youtube.com
2404:6800:4001::22 v7.lscache7.c.youtube.com
2404:6800:4001::22 v7.lscache8.c.youtube.com
2404:6800:4001::25 v8.lscache1.c.youtube.com
2404:6800:4001::25 v8.lscache2.c.youtube.com
2404:6800:4001:1::25 v8.lscache3.c.youtube.com
2404:6800:4001::25 v8.lscache4.c.youtube.com
2404:6800:4001:1::25 v8.lscache5.c.youtube.com
2404:6800:4001::25 v8.lscache6.c.youtube.com
2404:6800:4001::25 v8.lscache7.c.youtube.com
2404:6800:4001:1::25 v8.lscache8.c.youtube.com
2404:6800:4001::11 v9.lscache1.c.youtube.com
2404:6800:4001::11 v9.lscache2.c.youtube.com
2404:6800:4001::11 v9.lscache3.c.youtube.com
2404:6800:4001:1::11 v9.lscache4.c.youtube.com
2404:6800:4001:1::11 v9.lscache5.c.youtube.com
2404:6800:4001::11 v9.lscache6.c.youtube.com
2404:6800:4001::11 v9.lscache7.c.youtube.com
2404:6800:4001::11 v9.lscache8.c.youtube.com
2404:6800:4001::14 v10.lscache1.c.youtube.com
2404:6800:4001::14 v10.lscache2.c.youtube.com
2404:6800:4001::14 v10.lscache3.c.youtube.com
2404:6800:4001::14 v10.lscache4.c.youtube.com
2404:6800:4001::14 v10.lscache5.c.youtube.com
2404:6800:4001::14 v10.lscache6.c.youtube.com
2404:6800:4001:1::14 v10.lscache7.c.youtube.com
2404:6800:4001::14 v10.lscache8.c.youtube.com
2404:6800:4001::17 v11.lscache1.c.youtube.com
2404:6800:4001::17 v11.lscache2.c.youtube.com
2404:6800:4001::17 v11.lscache3.c.youtube.com
2404:6800:4001::17 v11.lscache4.c.youtube.com
2404:6800:4001::17 v11.lscache5.c.youtube.com
2404:6800:4001::17 v11.lscache6.c.youtube.com
2404:6800:4001::17 v11.lscache7.c.youtube.com
2404:6800:4001::17 v11.lscache8.c.youtube.com
2404:6800:4001::1a v12.lscache1.c.youtube.com
2404:6800:4001:1::1a v12.lscache2.c.youtube.com
2404:6800:4001::1a v12.lscache3.c.youtube.com
2404:6800:4001::1a v12.lscache4.c.youtube.com
2404:6800:4001::1a v12.lscache5.c.youtube.com
2404:6800:4001::1a v12.lscache6.c.youtube.com
2404:6800:4001::1a v12.lscache7.c.youtube.com
2404:6800:4001::1a v12.lscache8.c.youtube.com
2404:6800:4001::1d v13.lscache1.c.youtube.com
2404:6800:4001::1d v13.lscache2.c.youtube.com
2404:6800:4001::1d v13.lscache3.c.youtube.com
2404:6800:4001::1d v13.lscache4.c.youtube.com
2404:6800:4001:1::1d v13.lscache5.c.youtube.com
2404:6800:4001::1d v13.lscache6.c.youtube.com
2404:6800:4001::1d v13.lscache7.c.youtube.com
2404:6800:4001::1d v13.lscache8.c.youtube.com
2404:6800:4001::20 v14.lscache1.c.youtube.com
2404:6800:4001::20 v14.lscache2.c.youtube.com
2404:6800:4001::20 v14.lscache3.c.youtube.com
2404:6800:4001::20 v14.lscache4.c.youtube.com
2404:6800:4001::20 v14.lscache5.c.youtube.com
2404:6800:4001::20 v14.lscache6.c.youtube.com
2404:6800:4001::20 v14.lscache7.c.youtube.com
2404:6800:4001:1::20 v14.lscache8.c.youtube.com
2404:6800:4001::23 v15.lscache1.c.youtube.com
2404:6800:4001::23 v15.lscache2.c.youtube.com
2404:6800:4001::23 v15.lscache3.c.youtube.com
2404:6800:4001:1::23 v15.lscache4.c.youtube.com
2404:6800:4001::23 v15.lscache5.c.youtube.com
2404:6800:4001:1::23 v15.lscache6.c.youtube.com
2404:6800:4001::23 v15.lscache7.c.youtube.com
2404:6800:4001::23 v15.lscache8.c.youtube.com
2404:6800:4001::26 v16.lscache1.c.youtube.com
2404:6800:4001::26 v16.lscache2.c.youtube.com
2404:6800:4001:1::26 v16.lscache3.c.youtube.com
2404:6800:4001::26 v16.lscache4.c.youtube.com
2404:6800:4001::26 v16.lscache5.c.youtube.com
2404:6800:4001::26 v16.lscache6.c.youtube.com
2404:6800:4001::26 v16.lscache7.c.youtube.com
2404:6800:4001::26 v16.lscache8.c.youtube.com
2404:6800:4001::12 v17.lscache1.c.youtube.com
2404:6800:4001::12 v17.lscache2.c.youtube.com
2404:6800:4001::12 v17.lscache3.c.youtube.com
2404:6800:4001::12 v17.lscache4.c.youtube.com
2404:6800:4001::12 v17.lscache5.c.youtube.com
2404:6800:4001::12 v17.lscache6.c.youtube.com
2404:6800:4001::12 v17.lscache7.c.youtube.com
2404:6800:4001::12 v17.lscache8.c.youtube.com
2404:6800:4001:1::15 v18.lscache1.c.youtube.com
2404:6800:4001:1::15 v18.lscache2.c.youtube.com
2404:6800:4001::15 v18.lscache3.c.youtube.com
2404:6800:4001::15 v18.lscache4.c.youtube.com
2404:6800:4001::15 v18.lscache5.c.youtube.com
2404:6800:4001::15 v18.lscache6.c.youtube.com
2404:6800:4001:1::15 v18.lscache7.c.youtube.com
2404:6800:4001::15 v18.lscache8.c.youtube.com
2404:6800:4001:1::18 v19.lscache1.c.youtube.com
2404:6800:4001::18 v19.lscache2.c.youtube.com
2404:6800:4001:1::18 v19.lscache3.c.youtube.com
2404:6800:4001:1::18 v19.lscache4.c.youtube.com
2404:6800:4001::18 v19.lscache5.c.youtube.com
2404:6800:4001::18 v19.lscache6.c.youtube.com
2404:6800:4001::18 v19.lscache7.c.youtube.com
2404:6800:4001::18 v19.lscache8.c.youtube.com
2404:6800:4001::1b v20.lscache1.c.youtube.com
2404:6800:4001::1b v20.lscache2.c.youtube.com
2404:6800:4001::1b v20.lscache3.c.youtube.com
2404:6800:4001::1b v20.lscache4.c.youtube.com
2404:6800:4001:1::1b v20.lscache5.c.youtube.com
2404:6800:4001::1b v20.lscache6.c.youtube.com
2404:6800:4001::1b v20.lscache7.c.youtube.com
2404:6800:4001::1b v20.lscache8.c.youtube.com
2404:6800:4001::1e v21.lscache1.c.youtube.com
2404:6800:4001::1e v21.lscache2.c.youtube.com
2404:6800:4001:1::1e v21.lscache3.c.youtube.com
2404:6800:4001::1e v21.lscache4.c.youtube.com
2404:6800:4001::1e v21.lscache5.c.youtube.com
2404:6800:4001:1::1e v21.lscache6.c.youtube.com
2404:6800:4001::1e v21.lscache7.c.youtube.com
2404:6800:4001::1e v21.lscache8.c.youtube.com
2404:6800:4001::21 v22.lscache1.c.youtube.com
2404:6800:4001::21 v22.lscache2.c.youtube.com
2404:6800:4001::21 v22.lscache3.c.youtube.com
2404:6800:4001::21 v22.lscache4.c.youtube.com
2404:6800:4001:1::21 v22.lscache5.c.youtube.com
2404:6800:4001::21 v22.lscache6.c.youtube.com
2404:6800:4001::21 v22.lscache7.c.youtube.com
2404:6800:4001::21 v22.lscache8.c.youtube.com
2404:6800:4001:1::24 v23.lscache1.c.youtube.com
2404:6800:4001::24 v23.lscache2.c.youtube.com
2404:6800:4001::24 v23.lscache3.c.youtube.com
2404:6800:4001::24 v23.lscache4.c.youtube.com
2404:6800:4001::24 v23.lscache5.c.youtube.com
2404:6800:4001::24 v23.lscache6.c.youtube.com
2404:6800:4001::24 v23.lscache7.c.youtube.com
2404:6800:4001::24 v23.lscache8.c.youtube.com
2404:6800:4001::27 v24.lscache1.c.youtube.com
2404:6800:4001::27 v24.lscache2.c.youtube.com
2404:6800:4001::27 v24.lscache3.c.youtube.com
2404:6800:4001::27 v24.lscache4.c.youtube.com
2404:6800:4001:1::27 v24.lscache5.c.youtube.com
2404:6800:4001::27 v24.lscache6.c.youtube.com
2404:6800:4001::27 v24.lscache7.c.youtube.com
2404:6800:4001:1::27 v24.lscache8.c.youtube.com

#Android (Google's mobile operating system)
2404:6800:8005::62 www.android.com
2404:6800:8005::62 android.com
2404:6800:8003::79 developer.android.com
2404:6800:8005::62 source.android.com

#The Go Programming Language
2404:6800:8005::62 golang.org
2404:6800:8005::62 www.golang.org
2404:6800:8003::79 blog.golang.org

#Analytics (Google's website traffic statistics service)
2404:6800:8005::61 www.google-analytics.com
2404:6800:8005::61 *.google-analytics.com
2404:6800:8005::61 ssl.google-analytics.com

#DoubleClick: once the world's largest online advertising provider, acquired by Google (deal announced in 2007); the AdSense service now points to this domain
2001:4860:b006::94 ad.doubleclick.net
2001:4860:b006::95 ad-g.doubleclick.net
2001:4860:b006::95 ad-apac.doubleclick.net
2404:6800:8005::9a googleads.g.doubleclick.net
2404:6800:8005::9b feedads.g.doubleclick.net
2404:6800:8005::90 fls.uk.doubleclick.net
2404:6800:8005::8e *.au.doubleclick.net
2404:6800:8005::8f *.de.doubleclick.net
2404:6800:8005::90 *.uk.doubleclick.net
2404:6800:8005::90 *.fr.doubleclick.net
2404:6800:8005::92 *.jp.doubleclick.net

#FeedBurner
2404:6800:8005::62 feedburner.google.com
2404:6800:8005::62 www.feedburner.com
2404:6800:8005::62 feeds.feedburner.com
2404:6800:8005::62 feeds2.feedburner.com
2404:6800:8005::76 feedproxy.google.com #feed redirect proxy

#GoogleSyndication: Google ad service AdWords (paid keyword promotion shown on the right-hand side); formerly the domain the AdSense service pointed to
2404:6800:8005::62 www.googlesyndication.com
2404:6800:8005::62 pagead2.googlesyndication.com
2404:6800:8005::62 buttons.googlesyndication.com
2404:6800:8005::62 domains.googlesyndication.com
2404:6800:8005::98 tpc.googlesyndication.com

#GoogleSyndication: Google ad service AdWords
2404:6800:8005::70 adwords.google.com
2404:6800:8005::41 adwords.google.sk

#GoogleADServices: Google ad service AdSense
2404:6800:8005::60 www.googleadservices.com
2404:6800:8005::a4 pagead2.googleadservices.com
2404:6800:8005::a7 partner.googleadservices.com

#Panoramio
2001:4860:8010::8d www.panoramio.com
2001:4860:8010::8d static.panoramio.com

2404:6800:8005::62 pagead.google.com
2404:6800:8005::62 pagead.l.google.com
2404:6800:8005::62 pagead2.google.com

#Goo.gl (Google URL shortener)
2404:6800:8005::62 goo.gl
2404:6800:8005::b8 service.urchin.com

#Google.org (Google philanthropy)
2404:6800:8003::79 blog.google.org

[Repost] Using Apache Avro

Source: http://www.infoq.com/cn/articles/ApacheAvro

Author: Boris Lublinsky. Translator: 王恒涛. Published March 11, 2011.

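Avro [1] is one of the projects that recently joined Apache's Hadoop family. To support data-intensive applications, it defines a data format and supports that format in a number of programming languages.

Avro provides functionality similar to other marshalling systems such as Thrift and Protocol Buffers. Its main differences are [2]: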

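  • “Dynamic typing: Avro does not require code generation. Data is always accompanied by a schema, so the data can be fully processed without generated code or static data types. This makes it easier to build generic data-processing systems and languages.
  • Untagged data: because the schema is present when data is read, considerably less type information needs to be encoded with the data, which results in a smaller serialized size.
  • No manually assigned field IDs: when a schema changes, both the old and the new schema are always present when the data is processed, so differences can be resolved symbolically, using field names.”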
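
To make the "untagged data" point concrete, here is a minimal sketch in plain Java with no Avro dependency. It hand-encodes a two-field record along the lines of the binary encoding described in the Avro specification: values are written back to back in schema order, integers as zig-zag variable-length numbers and strings as a length followed by UTF-8 bytes, with no field names or tags in the output. The class name UntaggedEncodingSketch, the record layout (name: string, age: int) and the helper names are assumptions made for illustration only; real code would use Avro's own encoders.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: hand-rolled encoding of a record with the
// hypothetical schema {name: string, age: int}. Not part of Avro's API.
final class UntaggedEncodingSketch {

    // Zig-zag maps small negative and positive values to small unsigned ones.
    static long zigZag(long n) {
        return (n << 1) ^ (n >> 63);
    }

    // Writes a value as a variable-length zig-zag integer (7 bits per byte).
    static void writeLong(ByteArrayOutputStream out, long n) {
        long v = zigZag(n);
        while ((v & ~0x7FL) != 0) {
            out.write((int) ((v & 0x7F) | 0x80));
            v >>>= 7;
        }
        out.write((int) v);
    }

    // Writes a string as its UTF-8 byte length followed by the bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Encodes the record: values only, in schema order, no field tags.
    static byte[] encodePerson(String name, int age) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, name);
        writeLong(out, age);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encodePerson("Al", 3);
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x ", b));
        }
        System.out.println(hex.toString().trim()); // prints "04 41 6c 06"
    }
}
```

The record ("Al", 3) takes four bytes here. Nothing in those bytes says which value belongs to which field, so a reader can only decode them with the schema at hand, which is the trade-off the quoted bullet describes.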

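Because of its high performance, small codebase, and compact output, there is a lot of activity around Avro: many NoSQL implementations, including Hadoop and Cassandra, are integrating Avro into their client APIs and storage layers, and benchmarks comparing Avro with other popular serialization frameworks have been run and published [3]. So far, however, there are no code samples from which people can learn to use Avro [4].

In this article I will try to describe my experience with Avro, in particular:

  • How to build componentized Avro schemas, assembling the overall schema from components that are kept in separate files
  • Implementing inheritance in Avro
  • Implementing polymorphism in Avro
  • Backward compatibility of Avro documents.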

组件化Apache Avro模式

如Avro规范所述[5]Avro文档模式定义成JSON文件。在当前Avro实现中,模式类需要一个文件(或字符串)来表示内部模式。同XML模式不一样,Avro当前版本不支持向模式文档中导入(一个或多个)子模式,这往往迫使开发者编写非常复杂的模式定义[6],并大大复杂化了模式的重用。下面的代码示例给出了一个有趣的拆分和组合模式文件的例子。它基于模式类提供的一个toString()方法,该方法返回一个JSON字符串以表示给定的模式定义。用这种办法,我提供了一个简单AvroUtils,能够自动完成上述功能:

package com.navteq.avro.common;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

import org.apache.avro.Schema;

public class AvroUtils {

        private static Map<String, Schema> schemas = new HashMap<String, Schema>();

        private AvroUtils(){}

        public static void addSchema(String name, Schema schema){
               schemas.put(name, schema);

        }

        public static Schema getSchema(String name){
               return schemas.get(name);

        }

        public static String resolveSchema(String sc){

                String result = sc;
                for(Map.Entry<String, Schema> entry : schemas.entrySet())
                      result = replace(result, entry.getKey(),
                                        entry.getValue().toString());
                return result;

        }

        // Replaces every occurrence of pattern in str with replace.
        static String replace(String str, String pattern, String replace) {

                int s = 0;
                int e = 0;
                StringBuilder result = new StringBuilder();
                while ((e = str.indexOf(pattern, s)) >= 0) {
                        result.append(str.substring(s, e));
                        result.append(replace);
                        s = e + pattern.length();
                }
                result.append(str.substring(s));
                return result.toString();
        }

        // Expands registered schema names, parses the result, and registers it.
        public static Schema parseSchema(String schemaString){

                String completeSchema = resolveSchema(schemaString);
                Schema schema = Schema.parse(completeSchema);
                String name = schema.getFullName();
                schemas.put(name, schema);
                return schema;
        }

        public static Schema parseSchema(InputStream in) throws IOException {

                // Collect all bytes before decoding so that a multi-byte character
                // is never split across two read() calls. UTF-8 is assumed here.
                java.io.ByteArrayOutputStream out = new java.io.ByteArrayOutputStream();
                byte[] b = new byte[4096];
                for (int n; (n = in.read(b)) != -1;) {
                        out.write(b, 0, n);
                }
                return parseSchema(new String(out.toByteArray(), "UTF-8"));
        }

        public static Schema parseSchema(File file) throws IOException {

                FileInputStream fis = new FileInputStream(file);
                try {
                        return parseSchema(fis);
                } finally {
                        fis.close(); // do not leak the file handle
                }
        }
}

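Listing 1. The AvroUtils class

This simple implementation is based on a global (static) schema registry that maps fully qualified schema names to the corresponding Schema objects. For every new schema to be parsed, the implementation searches the schema text for fully qualified names that are already in the registry and replaces each one with that schema's JSON string. Once the schema string has been parsed, its full name and the resulting Schema object are stored in the registry.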
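One caveat of plain substring replacement is that a registered name can also match inside a longer name, for example com.navteq.avro.FacebookSpecialUser inside com.navteq.avro.FacebookSpecialUserExtension1. The sketch below shows a stricter variant that only replaces a name when it is not part of a longer dotted identifier. It is not part of the original AvroUtils: the class NameBoundaryResolver is a hypothetical name, and it stores schema JSON as plain strings so that it runs without Avro on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NameBoundaryResolver {

    // Fully qualified schema name -> JSON text of that schema.
    private final Map<String, String> registry = new LinkedHashMap<>();

    void register(String fullName, String schemaJson) {
        registry.put(fullName, schemaJson);
    }

    /** Replaces registered names with their JSON, but only as whole identifiers. */
    String resolve(String schemaText) {
        String result = schemaText;
        for (Map.Entry<String, String> entry : registry.entrySet()) {
            // The lookarounds reject matches that are preceded or followed by
            // a letter, digit, underscore, or dot.
            Pattern whole = Pattern.compile(
                    "(?<![\\w.])" + Pattern.quote(entry.getKey()) + "(?![\\w.])");
            result = whole.matcher(result)
                    .replaceAll(Matcher.quoteReplacement(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        NameBoundaryResolver resolver = new NameBoundaryResolver();
        resolver.register("com.navteq.avro.FacebookSpecialUser",
                "{\"type\":\"record\",\"name\":\"FacebookSpecialUser\"}");
        // The longer name below must be left untouched.
        System.out.println(resolver.resolve(
                "[com.navteq.avro.FacebookSpecialUserExtension1, \"null\"]"));
    }
}
```

Matcher.quoteReplacement matters here because schema JSON may contain characters such as $ or a backslash, which replaceAll would otherwise interpret.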

Here is a simple test that shows how to use this class:

package com.navteq.avro.common;

import org.apache.avro.Schema;

import org.junit.Test;

public class AvroUtilsTest {

       private static final String schemaDescription =
         "{ \n" +
            " \"namespace\": \"com.navteq.avro\", \n" +
            " \"name\": \"FacebookUser\", \n" +
            " \"type\": \"record\",\n" +
            " \"fields\": [\n" +
                     " {\"name\": \"name\", \"type\": [\"string\", \"null\"] },\n" +
                     " {\"name\": \"num_likes\", \"type\": \"int\"},\n" +
                     " {\"name\": \"num_photos\", \"type\": \"int\"},\n" +
            " {\"name\": \"num_groups\", \"type\": \"int\"} ]\n" +
         "}";

       private static final String schemaDescriptionExt =
         " { \n" +
             " \"namespace\": \"com.navteq.avro\", \n" +
             " \"name\": \"FacebookSpecialUser\", \n" +
             " \"type\": \"record\",\n" +
             " \"fields\": [\n" +
                      " {\"name\": \"user\", \"type\": com.navteq.avro.FacebookUser },\n" +
                      " {\"name\": \"specialData\", \"type\": \"int\"} ]\n" +
          "}";

       @Test
       public void testParseSchema() throws Exception{

               AvroUtils.parseSchema(schemaDescription);
               Schema extended = AvroUtils.parseSchema(schemaDescriptionExt);
               System.out.println(extended.toString(true));
       }
}

Listing 2. AvroUtils test

In this test, the fully qualified name of the first schema is com.navteq.avro.FacebookUser, so the substitution works and the test prints the following:

{
  "type" : "record",
  "name" : "FacebookSpecialUser",
  "namespace" : "com.navteq.avro",
  "fields" : [ {
    "name" : "user",
    "type" : {
      "type" : "record",
      "name" : "FacebookUser",
      "fields" : [ {
        "name" : "name",
        "type" : [ "string", "null" ]
      }, {
        "name" : "num_likes",
        "type" : "int"
      }, {
        "name" : "num_photos",
        "type" : "int"
      }, {
        "name" : "num_groups",
        "type" : "int"
      } ]
    }
  }, {
    "name" : "specialData",
    "type" : "int"
  } ]
}

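Listing 3. Output of AvroUtilsTest

Implementing inheritance with Apache Avro

A common way to define data is through inheritance: taking an existing data definition and adding parameters to it. Although Avro technically does not support inheritance [7], an inheritance-like structure is quite simple to implement.

Suppose we have a base class definition, FacebookUser, as follows: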

{
"namespace": "com.navteq.avro",
"name": "FacebookUser",
"type": "record",
"fields": [
  {"name": "name", "type": ["string", "null"] },
  {"name": "num_likes", "type": "int"},
  {"name": "num_photos", "type": "int"},
  {"name": "num_groups", "type": "int"} ]
}

Listing 4. Definition of the Facebook user record

Creating a FacebookSpecialUser definition is then straightforward. It looks something like this:

{
    "namespace": "com.navteq.avro",
  "name": "FacebookSpecialUser",
  "type": "record",
  "fields": [
    {"name": "user", "type": com.navteq.avro.FacebookUser },
      {"name": "specialData", "type": "int"}
    ]
}

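Listing 5. Definition of the special Facebook user record

The special user definition contains two fields: user, of type FacebookUser, and an int data field.

A simple test class for the special Facebook user follows: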

package com.navteq.avro.inheritance;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.File;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.Encoder;
import org.apache.avro.util.Utf8;
import org.junit.Before;
import org.junit.Test;

import com.navteq.avro.common.AvroUtils;

public class TestSimpleInheritance {

        private Schema schema;
        private Schema subSchema;

        @Before
        public void setUp() throws Exception {

                subSchema = AvroUtils.parseSchema(new File("resources/facebookUser.avro"));
                schema = AvroUtils.parseSchema(new File("resources/FacebookSpecialUser.avro"));

        }

        @Test
        public void testSimpleInheritance() throws Exception{
                ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
                GenericDatumWriter writer =
                            new GenericDatumWriter(schema);
                Encoder encoder = new BinaryEncoder(outputStream);

                GenericRecord subRecord1 = new GenericData.Record(subSchema);
                subRecord1.put("name", new Utf8("Doctor Who"));
                subRecord1.put("num_likes", 1);
                subRecord1.put("num_photos", 0);
                subRecord1.put("num_groups", 423);
                GenericRecord record1 = new GenericData.Record(schema);
                record1.put("user", subRecord1);
                record1.put("specialData", 1);

                writer.write(record1, encoder);

                GenericRecord subRecord2 = new GenericData.Record(subSchema);
                subRecord2.put("name", new org.apache.avro.util.Utf8("Doctor WhoWho"));
                subRecord2.put("num_likes", 2);
                subRecord2.put("num_photos", 0);
                subRecord2.put("num_groups", 424);
                GenericRecord record2 = new GenericData.Record(schema);
                record2.put("user", subRecord2);
                record2.put("specialData", 2);

                writer.write(record2, encoder);

                encoder.flush();

                ByteArrayInputStream inputStream =
                        new ByteArrayInputStream(outputStream.toByteArray());
                Decoder decoder = DecoderFactory.defaultFactory().
                        createBinaryDecoder(inputStream, null);
                GenericDatumReader reader =
                        new GenericDatumReader(schema);
                while(true){
                        try{
                              GenericRecord result = reader.read(null, decoder);
                              System.out.println(result);
                        }
                        catch(EOFException eof){
                                break;
                        }
                        catch(Exception ex){
                                ex.printStackTrace();
                        }
                }
        }
}

Listing 6. Test class for the special Facebook user [8]

Running this test class produces the expected result:

{"user": {"name": "Doctor Who", "num_likes": 1, "num_photos": 0,
"num_groups": 423}, "specialData": 1}
{"user": {"name": "Doctor WhoWho", "num_likes": 2, "num_photos": 0,
"num_groups": 424}, "specialData": 2}

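Listing 7. Test output for the special Facebook user

This code works well if all that is needed is a record containing the base data plus additional parameters, but it does not provide polymorphism: when reading such records, there is no way to know which type of record is actually being read.

Implementing polymorphism with Apache Avro

Unlike Google Protocol Buffers [9], Avro does not support optional parameters [10], so the inheritance implementation above is not suitable for polymorphism, because the special data parameter is mandatory. Fortunately, Avro supports unions, which make it possible to omit some of a record's parameters. The following definitions can be used to create a polymorphic record. For the base record I will use the example from Listing 4. For the extensions we will use the following two definitions: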

{
     "namespace": "com.navteq.avro",
   "name": "FacebookSpecialUserExtension1",
   "type": "record",
   "fields": [
      {"name": "specialData1", "type": "int"}
     ]
}

Listing 8. Definition of the first extension record

{
     "namespace": "com.navteq.avro",
   "name": "FacebookSpecialUserExtension2",
   "type": "record",
   "fields": [
      {"name": "specialData2", "type": "int"}
     ]
}

Listing 9. Definition of the second extension record

With these two definitions, a polymorphic record can be defined as follows:

{
     "namespace": "com.navteq.avro",
   "name": "FacebookSpecialUser",
   "type": "record",
   "fields": [
      {"name": "type", "type": "string" },
      {"name": "user", "type": com.navteq.avro.FacebookUser },
        {"name": "extension1", "type":
            [com.navteq.avro.FacebookSpecialUserExtension1, "null"]},
        {"name": "extension2", "type":
            [com.navteq.avro.FacebookSpecialUserExtension2, "null"]}
      ]
}

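Listing 10. Polymorphic definition of the special Facebook user

Here both extension1 and extension2 are optional, and either one may be present. To simplify processing, I added a type field that can be used to state the record type explicitly.

A better definition of the polymorphic record is given below: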

{
     "namespace": "com.navteq.avro",
   "name": "FacebookSpecialUser1",
   "type": "record",
   "fields": [
      {"name": "type", "type": "string" },
      {"name": "user", "type": com.navteq.avro.FacebookUser },
        {"name": "extension", "type":
            [com.navteq.avro.FacebookSpecialUserExtension1,
            com.navteq.avro.FacebookSpecialUserExtension2,
            "null"]}
      ]
}

Listing 11. Improved polymorphic definition of the special Facebook user

A simple test class for the polymorphic special Facebook user follows:

package com.navteq.avro.inheritance;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.File;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.Encoder;
import org.apache.avro.io.JsonDecoder;
import org.apache.avro.io.JsonEncoder;
import org.apache.avro.util.Utf8;
import org.junit.Before;
import org.junit.Test;

import com.navteq.avro.common.AvroUtils;

public class TestInheritance {

        private Schema FBUser;
        private Schema base;
        private Schema ext1;
        private Schema ext2;

        @Before
        public void setUp() throws Exception {

                 base = AvroUtils.parseSchema(new File("resources/facebookUser.avro"));
                 ext1 = AvroUtils.parseSchema(
                         new File("resources/FacebookSpecialUserExtension1.avro"));
                 ext2 = AvroUtils.parseSchema(
                         new File("resources/FacebookSpecialUserExtension2.avro"));
                 FBUser = AvroUtils.parseSchema(new File("resources/FacebooklUserInheritance.avro"));
        }

        @Test
        public void testInheritanceBinary() throws Exception{
                 ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
                 GenericDatumWriter writer =
                         new GenericDatumWriter(FBUser);
                 Encoder encoder = new BinaryEncoder(outputStream);

                 GenericRecord baseRecord = new GenericData.Record(base);
                 baseRecord.put("name", new Utf8("Doctor Who"));
                 baseRecord.put("num_likes", 1);
                 baseRecord.put("num_photos", 0);
                 baseRecord.put("num_groups", 423);
                 GenericRecord FBrecord = new GenericData.Record(FBUser);
                 FBrecord.put("type", "base");
                 FBrecord.put("user", baseRecord);

                 writer.write(FBrecord, encoder);

                 baseRecord = new GenericData.Record(base);
                 baseRecord.put("name", new Utf8("Doctor WhoWho"));
                 baseRecord.put("num_likes", 1);
                 baseRecord.put("num_photos", 0);
                 baseRecord.put("num_groups", 423);
                 GenericRecord extRecord = new GenericData.Record(ext1);
                 extRecord.put("specialData1", 1);
                 FBrecord = new GenericData.Record(FBUser);
                 FBrecord.put("type", "extension1");
                 FBrecord.put("user", baseRecord);
                 FBrecord.put("extension", extRecord);

                 writer.write(FBrecord, encoder);

                 baseRecord = new GenericData.Record(base);
                 baseRecord.put("name", new org.apache.avro.util.Utf8("Doctor WhoWhoWho"));
                 baseRecord.put("num_likes", 2);
                 baseRecord.put("num_photos", 0);
                 baseRecord.put("num_groups", 424);
                 extRecord = new GenericData.Record(ext2);
                 extRecord.put("specialData2", 2);
                 FBrecord = new GenericData.Record(FBUser);
                 FBrecord.put("type", "extension2");
                 FBrecord.put("user", baseRecord);
                 FBrecord.put("extension", extRecord);

                 writer.write(FBrecord, encoder);

                 encoder.flush();

                 byte[] data = outputStream.toByteArray();
                 ByteArrayInputStream inputStream = new ByteArrayInputStream(data);
                 Decoder decoder =
                        DecoderFactory.defaultFactory().createBinaryDecoder(inputStream, null);
                 GenericDatumReader reader =
                        new GenericDatumReader(FBUser);
                 while(true){
                        try{
                               GenericRecord result = reader.read(null, decoder);
                               System.out.println(result);
                        }
                        catch(EOFException eof){
                               break;
                        }
                        catch(Exception ex){
                               ex.printStackTrace();
                        }
                 }
        }
}

Listing 12. Test class for a polymorphic Facebook user record

Running this test class produces the expected result:

{"type": "base", "user": {"name": "Doctor Who", "num_likes": 1, "num_photos":
0, "num_groups": 423}, "extension": null}
{"type": "extension1", "user": {"name": "Doctor WhoWho", "num_likes": 1,
"num_photos": 0, "num_groups": 423}, "extension": {"specialData1": 1}}
{"type": "extension2", "user": {"name": "Doctor WhoWhoWho", "num_likes": 2,
"num_photos": 0, "num_groups": 424}, "extension": {"specialData2": 2}}

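Listing 13. Output of the polymorphic Facebook user record test

Backward compatibility with Apache Avro

One of the advantages of XML is backward compatibility when a schema definition is extended with optional parameters. To test this feature in Avro, we introduce the definition of a third extension record: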

{
     "namespace": "com.navteq.avro",
   "name": "FacebookSpecialUserExtension3",
   "type": "record",
   "fields": [
      {"name": "specialData3", "type": "int"}
   ]
}

Listing 14. Definition of the third extension record

The polymorphic record definition is changed as follows:

{
     "namespace": "com.navteq.avro",
   "name": "FacebookSpecialUser11",
   "type": "record",
   "fields": [
     {"name": "type", "type": "string" },
     {"name": "user", "type": com.navteq.avro.FacebookUser },
       {"name": "extension", "type":
          [com.navteq.avro.FacebookSpecialUserExtension1,
          com.navteq.avro.FacebookSpecialUserExtension2,
          com.navteq.avro.FacebookSpecialUserExtension3,
          "null"]}
     ]
}

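Listing 15. Modified polymorphic definition of the special Facebook user

When the code in Listing 12 is modified to read records using the definition in Listing 15 (while still writing the data with the definition in Listing 11), it produces the following result: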

{"type": "base", "user": {"name": "Doctor Who", "num_likes": 1, "num_photos":
0, "num_groups": 423}, "extension": {"specialData3": 10}}
java.lang.ArrayIndexOutOfBoundsException
      at java.lang.System.arraycopy(Native Method)
      at org.apache.avro.io.BinaryDecoder.doReadBytes(BinaryDecoder.java:331)
      at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:265)
      at org.apache.avro.io.ValidatingDecoder.readString(ValidatingDecoder.java:99)
      at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:318)
      at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:312)
      at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:120)
      at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:142)
      at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:114)
      at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:105)
      at com.navteq.avro.inheritance.TestInheritance.testInheritanceBinary(TestInheritance.java:119)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
      at java.lang.reflect.Method.invoke(Unknown Source)

Listing 16. Output of the polymorphic Facebook user record test against the extended definition
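One plausible reading of this output (an interpretation based on the Avro specification, not something the article states) is that it follows from how the binary encoding identifies union branches. A union is written as the zero-based index of the chosen branch followed by the value, and null occupies zero bytes. The modified definition inserts FacebookSpecialUserExtension3 before "null", so index 2 no longer means null but Extension3. The sketch below reproduces that situation in plain Java without Avro. The class UnionIndexMismatchDemo and its helpers are hypothetical, and only the zigzag varint and string rules of the specification are implemented.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class UnionIndexMismatchDemo {

    // Avro-style long: zigzag first, then base-128 varint, low-order group first.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    static long readLong(ByteBuffer in) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = in.get() & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1); // undo zigzag
    }

    // Avro-style string: byte length as a long, then the UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    /** Returns {branch index, value misread as specialData3, next string length}. */
    static long[] misread() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Writer, old union [Ext1, Ext2, null]: record 1 ends with branch 2 (null, no payload).
        writeLong(out, 2);
        // Record 2 then starts with its "type" field.
        writeString(out, "extension1");

        ByteBuffer in = ByteBuffer.wrap(out.toByteArray());
        long branch = readLong(in);      // reader, new union: index 2 now means Ext3
        long specialData3 = readLong(in); // so it consumes the string length as an int
        long nextLength = readLong(in);   // and then treats the letter 'e' as a length
        return new long[] {branch, specialData3, nextLength};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(misread()));
    }
}
```

Under these assumptions the misread value is 10, the length of the string "extension1", which matches the specialData3 value printed above. The next "length" comes out negative, which would be consistent with the exception raised inside readString. Appending new branches at the end of the union, rather than inserting them before "null", would at least keep the existing indexes stable.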

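Although Avro provides an API that can solve this problem (the GenericDatumReader<GenericRecord> constructor accepts two arguments, the schema used to write the record and the schema used to read it), this is not always a workable solution for backward compatibility, because it requires keeping track of every schema that was used to write each record.

A more appropriate solution is to switch from the binary encoder/decoder (which builds a binary representation of the record) to the JSON encoder/decoder. In this case the code works and produces the following result: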

{"type": "base", "user": {"name": "Doctor Who", "num_likes": 1, "num_photos":
0, "num_groups": 423}, "extension": null}
{"type": "extension1", "user": {"name": "Doctor WhoWho", "num_likes": 1,
"num_photos": 0, "num_groups": 423}, "extension": {"specialData1": 1}}
{"type": "extension2", "user": {"name": "Doctor WhoWhoWho", "num_likes": 2,
"num_photos": 0, "num_groups": 424}, "extension": {"specialData2": 2}}

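Listing 17. Output of the polymorphic Facebook user record test against the extended definition, using JSON encoding

With the JSON encoder, the actual data is converted to JSON: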

{"type":"base","user":{"name":{"string":"Doctor Who"},"num_likes":1,"num_photos":0,"num_groups":423},"extension":null}
{"type":"extension1","user":{"name":{"string":"Doctor WhoWho"},"num_likes":1,"num_photos":0,"num_groups":423},"extension":{"FacebookSpecialUserExtension1":{"specialData1":1}}}
{"type":"extension2","user":{"name":{"string":"Doctor WhoWhoWho"},"num_likes":2,"num_photos":0,"num_groups":424},"extension":{"FacebookSpecialUserExtension2":{"specialData2":2}}}

Listing 18. The data as converted under JSON encoding

One more issue to consider: in my tests, the same data produced Avro records of 89 bytes with binary encoding and 473 bytes with JSON encoding.
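The binary figure can be cross-checked by hand from the encoding rules in the Avro specification. The sketch below tallies the size of the three test records, assuming they are written with the improved polymorphic definition (the one with a type field and a single extension union). The class AvroBinarySizeEstimate and its methods are hypothetical helpers written for this check. They only count bytes and do not call Avro.

```java
import java.nio.charset.StandardCharsets;

final class AvroBinarySizeEstimate {

    /** Bytes taken by an int or long: zigzag, then 7 bits per byte. */
    static int varintSize(long n) {
        long z = (n << 1) ^ (n >> 63);
        int size = 1;
        while ((z & ~0x7FL) != 0) {
            z >>>= 7;
            size++;
        }
        return size;
    }

    /** A string is its UTF-8 byte length (as a long) followed by the bytes. */
    static int stringSize(String s) {
        int len = s.getBytes(StandardCharsets.UTF_8).length;
        return varintSize(len) + len;
    }

    /** FacebookUser: name is the union ["string", "null"], so branch index 0 comes first. */
    static int userSize(String name, int likes, int photos, int groups) {
        return varintSize(0) + stringSize(name)
                + varintSize(likes) + varintSize(photos) + varintSize(groups);
    }

    /**
     * extensionBranch follows the union order [Extension1, Extension2, "null"]:
     * 0 and 1 carry one int field, 2 is null and carries no payload.
     */
    static int recordSize(String type, int userBytes, int extensionBranch, int extensionValue) {
        int extension = varintSize(extensionBranch)
                + (extensionBranch == 2 ? 0 : varintSize(extensionValue));
        return stringSize(type) + userBytes + extension;
    }

    static int totalSize() {
        return recordSize("base", userSize("Doctor Who", 1, 0, 423), 2, 0)
                + recordSize("extension1", userSize("Doctor WhoWho", 1, 0, 423), 0, 1)
                + recordSize("extension2", userSize("Doctor WhoWhoWho", 2, 0, 424), 1, 2);
    }

    public static void main(String[] args) {
        System.out.println(totalSize());
    }
}
```

The three records come to 22, 32, and 35 bytes, for a total of 89, which agrees with the figure reported for binary encoding. Field names and type names contribute nothing to that total, whereas the JSON form repeats them in every record.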

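Conclusion

The current Avro implementation does not directly support schema componentization or reuse of schema components, but a simple framework like the one described in this article can provide these features. Although Avro does not directly support polymorphism, polymorphic data schemas are easy to implement with appropriate schema design. As for true backward compatibility, Avro supports it only when JSON encoding is used [11]. This last point has less to do with Avro itself and more to do with JSON. It severely limits Avro's applicability when backward compatibility is required, reducing it to a high-level JSON marshalling and processing API.

Besides the generic Avro approach used here, it is also possible to use the specific one, in which Avro generates specific record classes rather than generic records. Although there are claims [12] that the specific approach performs better, in my experience with the current version of Avro (1.4.1) the two perform the same.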


[1] http://hadoop.apache.org/avro/

[2] http://avro.apache.org/docs/1.4.1/

[3] http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking

[4] The few I have found cover Avro marshalling and Avro Map Reduce

[5] http://avro.apache.org/docs/current/spec.html

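[6] Interestingly, Avro IDL does support including sub-IDLs

[7] Unlike XML, which explicitly supports a base type in type definitions

[8] One thing worth noting about the code above is that schema parsing is done once during setup, because schema parsing is the most expensive operation in the Avro implementation.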

[9] http://code.google.com/p/protobuf/

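[10] Avro supports "null", which is different from an optional parameter: in Avro, "null" indicates that an attribute has no value

[11] Or if the old version of the schema is available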

[12] http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking

Original English article: Using Apache Avro


Thanks to 马国耀 for reviewing this article.
