MongoDB memory monitoring and management

Several ways to monitor and manage MongoDB and to check its resource usage and running state

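My 8 GB Tencent Cloud server is running short of memory. That is what I get for scraping every product SKU from two e-commerce platforms. Let's look at the first culprit, mongodb. Here is the result of running top -p $(pidof mongod):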

Memory usage reported by top -p $(pidof mongod)

https://www.bmc.com/blogs/mongodb-memory-usage-and-management/

This shows that mongodb is using half of the memory and redis-server the other half. Well, both of them are memory hogs!
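If you want the same resident-memory figure from a script rather than an interactive top session, it can be read from /proc on Linux. The sketch below is only an illustration: the helper names parse_vm_rss_kib and read_vm_rss_kib are my own, and it assumes the Linux /proc/<pid>/status format with a VmRSS line.

```python
import re
from typing import Optional


def parse_vm_rss_kib(status_text: str) -> Optional[int]:
    """Return the VmRSS value (in KiB) from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vm_rss_kib(pid: int) -> Optional[int]:
    """Read the resident set size of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_rss_kib(status_file.read())


# Hypothetical usage, with the pid taken from `pidof mongod`:
#   import subprocess
#   pid = int(subprocess.check_output(["pidof", "mongod"]).split()[0])
#   print(read_vm_rss_kib(pid))
```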

You can also check this from the mongodb shell:

> db.serverStatus().mem
{ "bits" : 64, "resident" : 2642, "virtual" : 4628, "supported" : true }
  • resident: the amount of actual physical memory (RAM) used by the process, in MiB.
  • virtual: the virtual memory used by the process, in MiB, i.e. its whole address space rather than only the part resident in RAM.
  • mapped: not present in the output above. This field belonged to MMAPv1, the earlier storage engine that memory-mapped data files. Since version 3.2 MongoDB uses WiredTiger by default and no longer memory-maps files.
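The same fields can be read from Python. This is a minimal sketch, assuming the third-party pymongo driver is installed and mongod listens on the default local port; the function names fetch_mem and resident_share and the 8192 MiB total are my own illustrative choices.

```python
def resident_share(mem: dict, total_ram_mib: float) -> float:
    """Fraction of total RAM that mongod holds resident.

    `mem` is the `mem` sub-document of serverStatus, whose values are in MiB.
    """
    return mem["resident"] / total_ram_mib


def fetch_mem(uri: str = "mongodb://localhost:27017") -> dict:
    """Fetch serverStatus().mem from a running mongod (needs pymongo)."""
    from pymongo import MongoClient  # third-party; imported lazily

    client = MongoClient(uri)
    try:
        return client.admin.command("serverStatus")["mem"]
    finally:
        client.close()


# Hypothetical usage on an 8 GB machine:
#   print(resident_share(fetch_mem(), total_ram_mib=8192))
```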

You can also run the following two commands in the mongodb shell:

> var mem = db.serverStatus().tcmalloc;
> mem.tcmalloc.formattedString
------------------------------------------------
MALLOC:     2760578144 ( 2632.7 MiB) Bytes in use by application
MALLOC: +    416550912 (  397.3 MiB) Bytes in page heap freelist
MALLOC: +     17226192 (   16.4 MiB) Bytes in central cache freelist
MALLOC: +        33792 (    0.0 MiB) Bytes in transfer cache freelist
MALLOC: +     12516816 (   11.9 MiB) Bytes in thread cache freelists
MALLOC: +     13762560 (   13.1 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =   3220668416 ( 3071.5 MiB) Actual memory used (physical + swap)
MALLOC: +     58572800 (   55.9 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =   3279241216 ( 3127.3 MiB) Virtual address space used
MALLOC:
MALLOC:          67767              Spans in use
MALLOC:             45              Thread heaps in use
MALLOC:           4096              Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.

Is there a cooler way? Of course there is, and it is free too. Run this in the MongoDB shell:

> db.enableFreeMonitoring()
{
	"state" : "enabled",
	"message" : "To see your monitoring data, navigate to the unique URL below. Anyone you share the URL with will also be able to view this page. You can disable monitoring at any time by running db.disableFreeMonitoring().",
	"url" : "https://cloud.mongodb.com/freemonitoring/cluster/xxxxxxxxxxxxxxxxxxx",
	"userReminder" : "",
	"ok" : 1
}

For safety I replaced my monitoring url with xxxxxx. Here is a first look:

MongoDB free monitoring web page

You can explore the other monitoring metrics yourself. For me, it clearly shows the state of the crawler and mongodb's resource usage.

redis ConnectionResetError: [Errno 104] Connection reset by peer

An idle connection causes the redis ConnectionResetError

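While scraping JD product reviews, I initially had redis_pipeline enabled, so the redis connection reset problem did not appear. Later I felt it was unnecessary and turned redis_pipeline off, and then redis ConnectionResetError: [Errno 104] Connection reset by peer showed up after the crawler had been running for a while.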

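The cause is easy to understand. If a product has 100 pages of reviews, redis is read only once per product SKU, so the gap between reads is sometimes long and the redis connection gets dropped. At first I set a timeout of 120 with retry on timeout and enabled health_check_interval, but that did not help. Looking at the source code, I found the keepalive parameters (socket_keepalive and socket_keepalive_options), so let's try turning keepalive on.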

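# Custom redis parameters (connection timeouts and so on)
REDIS_PARAMS = {
    'socket_timeout': 10,
    'socket_connect_timeout': 10,
    'retry_on_timeout': True,
    'health_check_interval': 10,
    # In redis-py, socket_keepalive is the boolean switch;
    # socket_keepalive_options expects a dict of TCP options, not True.
    'socket_keepalive': True,
}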
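In redis-py, socket_keepalive is the boolean that turns TCP keepalive on, while socket_keepalive_options is a dict mapping TCP option constants to values. Below is a minimal sketch of building both; the function name build_redis_params and the 60/10/3 timing values are my own assumptions, and the TCP_KEEP* constants are platform-dependent, so the code adds only those that exist.

```python
import socket


def build_redis_params(idle_s: int = 60, interval_s: int = 10, probes: int = 3) -> dict:
    """Build redis-py connection kwargs with TCP keepalive enabled."""
    options = {}
    # Add only the keepalive constants that this platform's socket module defines.
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # failed probes before the connection is dropped
    ):
        if hasattr(socket, name):
            options[getattr(socket, name)] = value
    return {
        "socket_timeout": 10,
        "socket_connect_timeout": 10,
        "retry_on_timeout": True,
        "health_check_interval": 10,
        "socket_keepalive": True,
        "socket_keepalive_options": options,
    }


# Hypothetical usage:
#   REDIS_PARAMS = build_redis_params()
#   client = redis.Redis(host="localhost", **build_redis_params())
```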

Problems like this usually come down to the network or to badly designed code of one's own.