Node extensions: an analysis of memwatch
Published: 2019-06-29


Introduction

memwatch is a C++ addon used mainly to track down memory leaks in Node.js. Basic usage:

    const memwatch = require('@airbnb/memwatch');

    function LeakingClass() {}

    memwatch.gc();
    var arr = [];
    var hd = new memwatch.HeapDiff();
    for (var i = 0; i < 10000; i++) arr.push(new LeakingClass);
    var hde = hd.end();
    console.log(JSON.stringify(hde, null, 2));

Implementation analysis

The analysis starts with binding.gyp (the version analyzed is not stated in the text):

    {
      'targets': [
        {
          'target_name': 'memwatch',
          'include_dirs': [
            "

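This configuration says that the build target is memwatch.node and that the sources are heapdiff.cc, init.cc, memwatch.cc and util.cc under the src directory. The build also needs the nan directory as an extra include path; that directory is located by running node -e "require('nan')", which resolves the nan dependency through Node's module system. The <! prefix means that what follows is a command to execute.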

memwatch's entry point is in init.cc, declared with NODE_MODULE(memwatch, init);. When require('@airbnb/memwatch') runs, the init function is called first:

    void init (v8::Handle<v8::Object> target)
    {
        Nan::HandleScope scope;
        heapdiff::HeapDiff::Initialize(target);

        Nan::SetMethod(target, "upon_gc", memwatch::upon_gc);
        Nan::SetMethod(target, "gc", memwatch::trigger_gc);

        Nan::AddGCPrologueCallback(memwatch::before_gc);
        Nan::AddGCEpilogueCallback(memwatch::after_gc);
    }

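The parameter v8::Handle<v8::Object> target of init is comparable to the module.exports / exports object in Node.js. The function body does three things: it initializes HeapDiff on target, binds the two functions upon_gc and gc to target, and registers hook functions that run before and after each garbage collection in Node.js.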

The Initialize implementation

The implementation of heapdiff::HeapDiff::Initialize(target); is in heapdiff.cc:

    void heapdiff::HeapDiff::Initialize ( v8::Handle<v8::Object> target )
    {
        Nan::HandleScope scope;

        v8::Local<v8::FunctionTemplate> t = Nan::New<v8::FunctionTemplate>(New);
        t->InstanceTemplate()->SetInternalFieldCount(1);
        t->SetClassName(Nan::New<v8::String>("HeapDiff").ToLocalChecked());

        Nan::SetPrototypeMethod(t, "end", End);

        target->Set(Nan::New<v8::String>("HeapDiff").ToLocalChecked(), t->GetFunction());
    }

Initialize creates a function template t named HeapDiff and attaches an end method to its prototype, so that JavaScript code can run var hp = new memwatch.HeapDiff(); hp.end().

The new memwatch.HeapDiff implementation

When JavaScript executes new memwatch.HeapDiff();, the C++ side runs heapdiff::HeapDiff::New. With comments and non-essential macros removed, New reduces to:

    NAN_METHOD(heapdiff::HeapDiff::New)
    {
        if (!info.IsConstructCall()) {
            return Nan::ThrowTypeError("Use the new operator to create instances of this object.");
        }

        Nan::HandleScope scope;

        HeapDiff * self = new HeapDiff();
        self->Wrap(info.This());

        s_inProgress = true;
        s_startTime = time(NULL);

        self->before = v8::Isolate::GetCurrent()->GetHeapProfiler()->TakeHeapSnapshot(NULL);

        s_inProgress = false;

        info.GetReturnValue().Set(info.This());
    }

So when the user runs var hp = new memwatch.HeapDiff(); in JavaScript, the C++ side calls the V8 API shipped with Node.js to take a snapshot of the heap, stores it in self->before, and returns the current object.

The memwatch.HeapDiff.End implementation

When the user calls hp.end(), the end method on the prototype runs, which is the C++ method heapdiff::HeapDiff::End. Again with redundant comments and macros removed, End reduces to:

    NAN_METHOD(heapdiff::HeapDiff::End)
    {
        Nan::HandleScope scope;

        HeapDiff *t = Unwrap<HeapDiff>( info.This() );

        if (t->ended) {
            return Nan::ThrowError("attempt to end() a HeapDiff that was already ended");
        }
        t->ended = true;

        s_inProgress = true;
        t->after = v8::Isolate::GetCurrent()->GetHeapProfiler()->TakeHeapSnapshot(NULL);
        s_inProgress = false;

        v8::Local<Value> comparison = compare(t->before, t->after);

        ((HeapSnapshot *) t->before)->Delete();
        t->before = NULL;
        ((HeapSnapshot *) t->after)->Delete();
        t->after = NULL;

        info.GetReturnValue().Set(comparison);
    }

End retrieves the current HeapDiff object, takes a second snapshot of the heap, and calls compare on the two snapshots to produce comparison. It then releases both snapshots and returns the result to JavaScript.

Next, the compare function. Internally it calls buildIDSet, which recurses over the snapshot graph, and uses the resulting ID sets to compute the final heap diff.

    static v8::Local<Value>
    compare(const v8::HeapSnapshot * before, const v8::HeapSnapshot * after)
    {
        Nan::EscapableHandleScope scope;
        int s, diffBytes;

        Local<Object> o = Nan::New<v8::Object>();

        // first let's append summary information
        Local<Object> b = Nan::New<v8::Object>();
        b->Set(Nan::New("nodes").ToLocalChecked(), Nan::New(before->GetNodesCount()));
        //b->Set(Nan::New("time"), s_startTime);
        o->Set(Nan::New("before").ToLocalChecked(), b);

        Local<Object> a = Nan::New<v8::Object>();
        a->Set(Nan::New("nodes").ToLocalChecked(), Nan::New(after->GetNodesCount()));
        //a->Set(Nan::New("time"), time(NULL));
        o->Set(Nan::New("after").ToLocalChecked(), a);

        // now let's get allocations by name
        set<uint64_t> beforeIDs, afterIDs;
        s = 0;
        buildIDSet(&beforeIDs, before->GetRoot(), s);
        b->Set(Nan::New("size_bytes").ToLocalChecked(), Nan::New(s));
        b->Set(Nan::New("size").ToLocalChecked(), Nan::New(mw_util::niceSize(s).c_str()).ToLocalChecked());

        diffBytes = s;
        s = 0;
        buildIDSet(&afterIDs, after->GetRoot(), s);
        a->Set(Nan::New("size_bytes").ToLocalChecked(), Nan::New(s));
        a->Set(Nan::New("size").ToLocalChecked(), Nan::New(mw_util::niceSize(s).c_str()).ToLocalChecked());

        diffBytes = s - diffBytes;

        Local<Object> c = Nan::New<v8::Object>();
        c->Set(Nan::New("size_bytes").ToLocalChecked(), Nan::New(diffBytes));
        c->Set(Nan::New("size").ToLocalChecked(), Nan::New(mw_util::niceSize(diffBytes).c_str()).ToLocalChecked());
        o->Set(Nan::New("change").ToLocalChecked(), c);

        // before - after will reveal nodes released (memory freed)
        vector<uint64_t> changedIDs;
        setDiff(beforeIDs, afterIDs, changedIDs);
        c->Set(Nan::New("freed_nodes").ToLocalChecked(), Nan::New<v8::Number>(changedIDs.size()));

        // here's where we'll collect all the summary information
        changeset changes;

        // for each of these nodes, let's aggregate the change information
        for (unsigned long i = 0; i < changedIDs.size(); i++) {
            const HeapGraphNode * n = before->GetNodeById(changedIDs[i]);
            manageChange(changes, n, false);
        }

        changedIDs.clear();

        // after - before will reveal nodes added (memory allocated)
        setDiff(afterIDs, beforeIDs, changedIDs);
        c->Set(Nan::New("allocated_nodes").ToLocalChecked(), Nan::New<v8::Number>(changedIDs.size()));

        for (unsigned long i = 0; i < changedIDs.size(); i++) {
            const HeapGraphNode * n = after->GetNodeById(changedIDs[i]);
            manageChange(changes, n, true);
        }

        c->Set(Nan::New("details").ToLocalChecked(), changesetToObject(changes));

        return scope.Escape(o);
    }

This function builds two objects, b (before) and a (after), which hold the summary of each snapshot. Described as a JavaScript object:

    // b (before) / a (after)
    {
        nodes:      // number of object nodes in the heap snapshot
        size_bytes: // size of the objects in the heap snapshot, in bytes
        size:       // the same size in human-readable form (kb, mb)
    }

Further analysis of the two snapshots yields o. Its before and after fields refer to the b and a summary objects of the two snapshots:

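    // o
    {
        before: { // summary of the "before" heap snapshot
            nodes:
            size_bytes:
            size:
        },
        after: { // summary of the "after" heap snapshot
            nodes:
            size_bytes:
            size:
        },
        change: {
            size_bytes:      // net change in bytes (after minus before)
            size:            // the same change in kb, mb
            freed_nodes:     // number of nodes collected by the GC
            allocated_nodes: // number of newly added nodes
            details: [ // details aggregated by type (String, Array, ...)
                {
                    Array : {
                        what:       // type
                        size_bytes: // size in bytes
                        size:       // size in kb, mb
                        +:          // number added
                        -:          // number collected
                    }
                },
                {}
            ]
        }
    }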

Once the two snapshots have been compared, o is returned, and End passes it to JavaScript with info.GetReturnValue().Set(comparison);.
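The size fields are strings produced by mw_util::niceSize, whose implementation is not shown in this article. As a rough idea of what such a helper does, here is a minimal sketch. The name humanSize, the 1024-byte step, the two-decimal format and the unit labels are all assumptions made for illustration and may not match memwatch's actual output.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Formats a byte count (possibly negative, as in a diff) with a unit suffix.
// Assumption: 1 kb = 1024 bytes, two decimals; mw_util::niceSize may differ.
std::string humanSize(long bytes) {
    const char* units[] = {"bytes", "kb", "mb", "gb"};
    double value = static_cast<double>(std::labs(bytes));
    int unit = 0;
    while (value >= 1024.0 && unit < 3) {
        value /= 1024.0;
        ++unit;
    }
    char buf[64];
    if (unit == 0) {
        // Small values are printed exactly, sign included.
        std::snprintf(buf, sizeof buf, "%ld %s", bytes, units[0]);
    } else {
        std::snprintf(buf, sizeof buf, "%s%.2f %s",
                      bytes < 0 ? "-" : "", value, units[unit]);
    }
    return buf;
}
```

A negative input, as produced when more memory was freed than allocated, keeps its sign in the formatted string.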

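Now for the buildIDSet, setDiff and manageChange functions used by compare. buildIDSet is called as buildIDSet(&beforeIDs, before->GetRoot(), s);. Starting from the root node of the heap snapshot, it recursively visits every reachable child node and adds its ID to the set seen. While this depth-first search collects the reachable nodes, it also sums each node's shallow size (the memory occupied by the object itself, excluding the objects it references) to obtain the total size of the heap. Object nodes named HeapDiff are skipped, so the diff does not count itself. The implementation: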

    static void buildIDSet(set<uint64_t> * seen, const HeapGraphNode* cur, int & s)
    {
        Nan::HandleScope scope;

        // cycle detection
        if (seen->find(cur->GetId()) != seen->end()) {
            return;
        }

        // always ignore HeapDiff related memory
        if (cur->GetType() == HeapGraphNode::kObject &&
            handleToStr(cur->GetName()).compare("HeapDiff") == 0)
        {
            return;
        }

        // update memory usage as we go
        s += cur->GetShallowSize();

        seen->insert(cur->GetId());

        for (int i=0; i < cur->GetChildrenCount(); i++) {
            buildIDSet(seen, cur->GetChild(i)->GetToNode(), s);
        }
    }
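The traversal can be tried outside V8 with a mock graph. The sketch below is a stand-alone model of the same idea, not memwatch code: Node, Graph, collectReachable and makeExampleGraph are hypothetical names, and the graph is a plain map from id to node rather than a v8::HeapSnapshot.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <unordered_map>
#include <vector>

// Hypothetical node type standing in for v8::HeapGraphNode.
struct Node {
    std::uint64_t id;
    int shallow_size;                     // bytes used by the node itself
    std::vector<std::uint64_t> children;  // ids of referenced nodes
};

using Graph = std::unordered_map<std::uint64_t, Node>;

// Depth-first walk from `cur`: records every reachable id in `seen` and
// adds each node's shallow size to `total` exactly once.
void collectReachable(const Graph& g, std::uint64_t cur,
                      std::set<std::uint64_t>& seen, int& total) {
    if (seen.count(cur) != 0) return;  // already visited (handles cycles)
    auto it = g.find(cur);
    assert(it != g.end() && "edge points at a node missing from the graph");
    seen.insert(cur);
    total += it->second.shallow_size;
    for (std::uint64_t child : it->second.children) {
        collectReachable(g, child, seen, total);
    }
}

// Small graph with a cycle (1 -> 2 -> 3 -> 1) and an unreachable node 4.
Graph makeExampleGraph() {
    Graph g;
    g[1] = Node{1, 16, {2, 3}};
    g[2] = Node{2, 24, {3}};
    g[3] = Node{3, 8, {1}};
    g[4] = Node{4, 100, {}};
    return g;
}
```

Walking the example graph from node 1 visits nodes 1, 2 and 3 once each despite the cycle, and node 4 contributes nothing because no edge leads to it.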

setDiff is called as setDiff(beforeIDs, afterIDs, changedIDs); and computes a set difference. The implementation is simple, so the code is given without further comment:

    typedef set<uint64_t> idset;

    // why doesn't STL work?
    // XXX: improve this algorithm
    void setDiff(idset a, idset b, vector<uint64_t> &c)
    {
        for (idset::iterator i = a.begin(); i != a.end(); i++) {
            if (b.find(*i) == b.end()) c.push_back(*i);
        }
    }
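For comparison, the standard library offers std::set_difference, which produces the same result for sorted ranges in a single linear pass. The sketch below is an alternative under that assumption, not the code memwatch uses; IdSet and idsOnlyIn are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <set>
#include <vector>

using IdSet = std::set<std::uint64_t>;

// Returns the ids present in `a` but not in `b`, in ascending order.
// std::set iterates in sorted order, which std::set_difference requires.
std::vector<std::uint64_t> idsOnlyIn(const IdSet& a, const IdSet& b) {
    std::vector<std::uint64_t> out;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(out));
    return out;
}
```

Calling idsOnlyIn(before, after) gives the freed ids and idsOnlyIn(after, before) gives the allocated ids. Taking the sets by const reference also avoids the copies made by the by-value parameters in setDiff.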

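manageChange is called as manageChange(changes, n, false); and aggregates the data: for a given set of nodes, it groups them by object type and counts how many objects of each type were created and how many were released. The implementation: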

    static void manageChange(changeset & changes, const HeapGraphNode * node, bool added)
    {
        std::string type;

        switch(node->GetType()) {
            case HeapGraphNode::kArray:
                type.append("Array");
                break;
            case HeapGraphNode::kString:
                type.append("String");
                break;
            case HeapGraphNode::kObject:
                type.append(handleToStr(node->GetName()));
                break;
            case HeapGraphNode::kCode:
                type.append("Code");
                break;
            case HeapGraphNode::kClosure:
                type.append("Closure");
                break;
            case HeapGraphNode::kRegExp:
                type.append("RegExp");
                break;
            case HeapGraphNode::kHeapNumber:
                type.append("Number");
                break;
            case HeapGraphNode::kNative:
                type.append("Native");
                break;
            case HeapGraphNode::kHidden:
            default:
                return;
        }

        if (changes.find(type) == changes.end()) {
            changes[type] = change();
        }

        changeset::iterator i = changes.find(type);

        i->second.size += node->GetShallowSize() * (added ? 1 : -1);
        if (added) i->second.added++;
        else i->second.released++;

        return;
    }
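The aggregation step can be modelled without V8 as a map from type name to running totals. In the sketch below, Change mirrors the three fields the code above updates (size, added, released); NodeInfo, ChangeSet, recordChange and summarize are hypothetical names, and the type label is assumed to be already resolved to a string.

```cpp
#include <map>
#include <string>
#include <vector>

// Per-type totals, modelled on the fields manageChange updates.
struct Change {
    long size = 0;      // net bytes: added sizes minus released sizes
    long added = 0;     // nodes that appeared
    long released = 0;  // nodes that disappeared
};

// Hypothetical flattened node record: a type label plus its shallow size.
struct NodeInfo {
    std::string type;
    int shallow_size;
};

using ChangeSet = std::map<std::string, Change>;

// Folds one node into the running totals for its type.
void recordChange(ChangeSet& changes, const NodeInfo& node, bool added) {
    Change& c = changes[node.type];  // operator[] default-constructs on first use
    c.size += added ? node.shallow_size : -node.shallow_size;
    if (added) ++c.added; else ++c.released;
}

// Aggregates the freed and allocated node lists into one change set.
ChangeSet summarize(const std::vector<NodeInfo>& freed,
                    const std::vector<NodeInfo>& allocated) {
    ChangeSet changes;
    for (const NodeInfo& n : freed) recordChange(changes, n, false);
    for (const NodeInfo& n : allocated) recordChange(changes, n, true);
    return changes;
}
```

Because std::map::operator[] inserts a value-initialized entry for a missing key, the explicit find-then-insert step in manageChange is not needed in this model.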

The upon_gc and gc implementations

Both methods are registered in the init function:

    Nan::SetMethod(target, "upon_gc", memwatch::upon_gc);
    Nan::SetMethod(target, "gc", memwatch::trigger_gc);

First the gc method, which maps to memwatch::trigger_gc:

    NAN_METHOD(memwatch::trigger_gc) {
        Nan::HandleScope scope;
        int deadline_in_ms = 500;
        if (info.Length() >= 1 && info[0]->IsNumber()) {
            deadline_in_ms = (int)(info[0]->Int32Value());
        }
        Nan::IdleNotification(deadline_in_ms);
        Nan::LowMemoryNotification();
        info.GetReturnValue().Set(Nan::Undefined());
    }

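Nan::IdleNotification and Nan::LowMemoryNotification trigger V8's garbage collector. Now the upon_gc method: it registers a function that is invoked once a garbage collection has run, for example one triggered by the gc method: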

    NAN_METHOD(memwatch::upon_gc) {
        Nan::HandleScope scope;
        if (info.Length() >= 1 && info[0]->IsFunction()) {
            uponGCCallback = new UponGCCallback(info[0].As<v8::Function>());
        }
        info.GetReturnValue().Set(Nan::Undefined());
    }

Here info[0] is the callback function passed in by the user. Calling new UponGCCallback runs the following constructor:

    UponGCCallback(v8::Local<v8::Function> callback_) : Nan::AsyncResource("memwatch:upon_gc") {
        callback.Reset(callback_);
    }

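This stores the user's callback_ function in the callback member of the UponGCCallback class. When the upon_gc callback actually fires depends on the GC hooks, which the next section covers in detail.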
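The register-now, call-later pattern can be modelled in plain C++ with std::function in place of a persistent V8 function handle. This is a simplified sketch with hypothetical names (UponGc, GcReport, set, emit); it ignores the Nan::AsyncResource bookkeeping that the real class inherits.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical payload handed to the listener after a collection.
struct GcReport {
    double gc_time_ms;
};

class UponGc {
public:
    using Listener = std::function<void(const std::string&, const GcReport&)>;

    // Stores the listener; a later call replaces the earlier one,
    // as assigning a new UponGCCallback does in upon_gc.
    void set(Listener listener) { listener_ = std::move(listener); }

    // Invokes the listener with the event name and the report.
    // Returns false when nothing has been registered yet.
    bool emit(const GcReport& report) const {
        if (!listener_) return false;
        listener_("stats", report);
        return true;
    }

private:
    Listener listener_;
};
```

Keeping the callback in a member and checking it before each call mirrors the if (uponGCCallback) guard that appears later in AsyncMemwatchAfter.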

The before- and after-GC hook implementations

The GC hooks are registered as follows:

    Nan::AddGCPrologueCallback(memwatch::before_gc);
    Nan::AddGCEpilogueCallback(memwatch::after_gc);

First memwatch::before_gc, which records the time at which the GC starts:

    NAN_GC_CALLBACK(memwatch::before_gc) {
        currentGCStartTime = uv_hrtime();
    }

Then memwatch::after_gc, which records the outcome of the GC in the GCStats struct once the GC has finished:

    struct GCStats {
        // counts of different types of gc events
        size_t gcScavengeCount;              // number of scavenge GCs
        uint64_t gcScavengeTime;             // time spent in scavenge GCs
        size_t gcMarkSweepCompactCount;      // number of mark-sweep-compact GCs
        uint64_t gcMarkSweepCompactTime;     // time spent in mark-sweep-compact GCs
        size_t gcIncrementalMarkingCount;    // number of incremental marking GCs
        uint64_t gcIncrementalMarkingTime;   // time spent in incremental marking
        size_t gcProcessWeakCallbacksCount;  // number of weak-callback processing passes
        uint64_t gcProcessWeakCallbacksTime; // time spent processing weak callbacks
    };

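After updating the GC statistics, the function reads the heap usage through the V8 API and stores the result in a baton. The baton holds a uv_work_t member req, whose data field points back to the baton object itself.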

    NAN_GC_CALLBACK(memwatch::after_gc) {
        if (heapdiff::HeapDiff::InProgress()) return;

        uint64_t gcEnd = uv_hrtime();
        uint64_t gcTime = gcEnd - currentGCStartTime;

        switch(type) {
            case kGCTypeScavenge:
                s_stats.gcScavengeCount++;
                s_stats.gcScavengeTime += gcTime;
                return;
            case kGCTypeMarkSweepCompact:
            case kGCTypeAll:
                break;
        }

        if (type == kGCTypeMarkSweepCompact) {
            s_stats.gcMarkSweepCompactCount++;
            s_stats.gcMarkSweepCompactTime += gcTime;

            Nan::HandleScope scope;

            Baton * baton = new Baton;
            v8::HeapStatistics hs;

            Nan::GetHeapStatistics(&hs);

            timeval tv;
            gettimeofday(&tv, NULL);

            baton->gc_ts = (tv.tv_sec * 1000000) + tv.tv_usec;

            baton->total_heap_size = hs.total_heap_size();
            baton->total_heap_size_executable = hs.total_heap_size_executable();
            baton->req.data = (void *) baton;

            uv_queue_work(uv_default_loop(), &(baton->req),
                noop_work_func, (uv_after_work_cb)AsyncMemwatchAfter);
        }
    }
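The timing bookkeeping done by before_gc and after_gc can be sketched independently of V8 and libuv. The model below uses std::chrono::steady_clock where memwatch uses uv_hrtime; GcKind, Stats and GcTimer are hypothetical names, and only the scavenge and mark-sweep-compact cases are modelled.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>

// Hypothetical GC kinds; V8 defines its own v8::GCType values.
enum class GcKind { Scavenge, MarkSweepCompact };

struct Stats {
    std::size_t scavengeCount = 0;
    std::uint64_t scavengeTimeNs = 0;
    std::size_t markSweepCompactCount = 0;
    std::uint64_t markSweepCompactTimeNs = 0;
};

class GcTimer {
public:
    using Clock = std::chrono::steady_clock;

    // "Before GC" hook: remember when the cycle started.
    void before(Clock::time_point now) { start_ = now; }

    // "After GC" hook: charge the elapsed time to the counter for `kind`.
    // Returns true for a full collection, the case in which after_gc
    // goes on to queue a notification.
    bool after(GcKind kind, Clock::time_point now) {
        const std::uint64_t elapsed = static_cast<std::uint64_t>(
            std::chrono::duration_cast<std::chrono::nanoseconds>(now - start_).count());
        if (kind == GcKind::Scavenge) {
            ++stats_.scavengeCount;
            stats_.scavengeTimeNs += elapsed;
            return false;
        }
        ++stats_.markSweepCompactCount;
        stats_.markSweepCompactTimeNs += elapsed;
        return true;
    }

    const Stats& stats() const { return stats_; }

private:
    Clock::time_point start_{};
    Stats stats_;
};
```

In a real hook both calls would pass GcTimer::Clock::now(); taking the time point as a parameter only makes the sketch easy to exercise with fixed values.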

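With that done, the work item is queued on the libuv loop, and the callback fires at a suitable time. The callback receives the req object, and casting req.data yields the baton. Inside the loop callback, the data held in the baton is copied field by field into a stats object, and uponGCCallback's Call method is invoked with the string literal "stats" and the stats object.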

    static void AsyncMemwatchAfter(uv_work_t* request) {
        Nan::HandleScope scope;

        Baton * b = (Baton *) request->data;
        // if there are any listeners, it's time to emit!
        if (uponGCCallback) {
            Local<Value> argv[2];

            Local<Object> stats = Nan::New<v8::Object>();

            stats->Set(Nan::New("gc_ts").ToLocalChecked(), javascriptNumber(b->gc_ts));
            stats->Set(Nan::New("gcProcessWeakCallbacksCount").ToLocalChecked(), javascriptNumberSize(b->stats.gcProcessWeakCallbacksCount));
            stats->Set(Nan::New("gcProcessWeakCallbacksTime").ToLocalChecked(), javascriptNumber(b->stats.gcProcessWeakCallbacksTime));
            stats->Set(Nan::New("peak_malloced_memory").ToLocalChecked(), javascriptNumberSize(b->peak_malloced_memory));
            stats->Set(Nan::New("gc_time").ToLocalChecked(), javascriptNumber(b->gc_time));

            // the type of event to emit
            argv[0] = Nan::New("stats").ToLocalChecked();
            argv[1] = stats;
            uponGCCallback->Call(2, argv);
        }

        delete b;
    }
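The baton pattern used here, a heap-allocated payload whose embedded request points back at it, can be shown with a toy loop instead of libuv. Everything in this sketch is a hypothetical stand-in: WorkRequest plays the role of uv_work_t, Loop the role of the libuv loop, and g_lastHeapSize exists only so the effect of the callback can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>

// Minimal stand-in for libuv's uv_work_t: only the user data pointer.
struct WorkRequest {
    void* data = nullptr;
};

using AfterWorkCb = void (*)(WorkRequest*);

// Payload carried across the asynchronous hop.
struct Baton {
    WorkRequest req;
    std::uint64_t gc_ts = 0;
    std::size_t total_heap_size = 0;
};

// Toy event loop: stores (request, callback) pairs and runs them later.
class Loop {
public:
    void queue(WorkRequest* req, AfterWorkCb cb) { pending_.push({req, cb}); }
    void run() {
        while (!pending_.empty()) {
            Item item = pending_.front();
            pending_.pop();
            item.cb(item.req);
        }
    }
private:
    struct Item { WorkRequest* req; AfterWorkCb cb; };
    std::queue<Item> pending_;
};

// Last heap size seen by the callback (a global only to keep the sketch short).
std::size_t g_lastHeapSize = 0;

// Callback: recover the baton from req->data, use it, then free it.
void afterWork(WorkRequest* req) {
    Baton* baton = static_cast<Baton*>(req->data);
    assert(baton != nullptr);
    g_lastHeapSize = baton->total_heap_size;
    delete baton;
}

// Producer side: allocate the baton, point req.data back at it, enqueue.
void notify(Loop& loop, std::size_t heapSize) {
    Baton* baton = new Baton;
    baton->total_heap_size = heapSize;
    baton->req.data = baton;
    loop.queue(&baton->req, afterWork);
}
```

The callback only ever receives the request pointer, so the back-pointer in data is what lets it find the payload again, and deleting the baton there matches the delete b at the end of AsyncMemwatchAfter.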

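Finally, Call invokes the callback_ function supplied from JavaScript, passing the string literal "stats" and the stats object up to the JavaScript layer for the user.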

    void Call(int argc, Local<v8::Value> argv[]) {
        v8::Isolate *isolate = v8::Isolate::GetCurrent();
        runInAsyncScope(isolate->GetCurrentContext()->Global(), Nan::New(callback), argc, argv);
    }

Reposted from: http://uxlfm.baihongyu.com/
