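We often save Markdown articles, and many of them contain information-dense images. Those images are usually hosted on third-party image hosts that could stop working at any time, so it makes sense to batch-migrate them to a paid cloud storage service of our own, such as Upyun (又拍云), Qiniu Cloud (七牛云), or Alibaba Cloud (阿里云). The free tiers of these services can also be used.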

This article presents a Python script that automatically replaces the remote images in a Markdown file with Qiniu Cloud URLs.

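About the script

The script replaces the remote images in an .md file with Qiniu Cloud URLs. It works in these steps:

  • Read the Markdown file (input)
  • Extract all image links
  • Use Qiniu's fetch-from-URL feature to copy each resource into the bucket
  • Loop over all images and obtain the new URL of each one
  • Replace the image URLs in the file content
  • Back up the original file (@todo)
  • Overwrite the file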
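
The extract and replace steps above can be sketched in isolation, without any Qiniu calls. This is a minimal sketch, not the script's own code: the function names extract_image_urls and replace_image_urls and the url_map argument are hypothetical, and the pattern only handles the plain ![alt](url) form (an image with a quoted title after the URL would need a stricter pattern).

```python
import re

# Three groups: the "![alt](" prefix, the URL, and the closing ")".
# The non-greedy quantifiers keep several images on one line apart.
IMAGE_PATTERN = re.compile(r'(!\[.*?\]\()(.+?)(\))')


def extract_image_urls(markdown_text):
    """Return the URL of every ![alt](url) image, in document order."""
    return [m.group(2) for m in IMAGE_PATTERN.finditer(markdown_text)]


def replace_image_urls(markdown_text, url_map):
    """Swap each image URL that has an entry in url_map; keep the others."""
    def swap(match):
        old_url = match.group(2)
        return match.group(1) + url_map.get(old_url, old_url) + match.group(3)

    return IMAGE_PATTERN.sub(swap, markdown_text)
```

Rewriting only the URL group leaves the alt text untouched even when it happens to contain the same string as the URL.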

usage:

& python qiniu-img-class.py > i:/qiniu-img-log
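
Because the script asks for the file path with input(), redirecting standard output as in the usage line may send the prompt into the log file instead of the screen. One possible alternative, shown here only as a sketch, is to take the path from the command line. The parse_args function and the markdown_file argument name are assumptions and are not part of the original script.

```python
import argparse


def parse_args(argv=None):
    """Read the Markdown path from the command line instead of prompting."""
    parser = argparse.ArgumentParser(
        description="Replace remote images in a Markdown file with Qiniu URLs"
    )
    # Hypothetical positional argument: the file that MdImgFetch would process.
    parser.add_argument("markdown_file", help="path of the Markdown file to process")
    return parser.parse_args(argv)


# Possible use inside the script's main block:
#   args = parse_args()
#   MdImgFetch(args.markdown_file).fetch()
```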

Implementation

Prerequisite: install the qiniu module (pip install qiniu).
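
The script that follows reads its credentials from qiniu.json, looks up the "9ong" entry, and uses the keys access_key, secret_key, bucket_name and public_domain. The sketch below shows one layout that matches those lookups and a small loader that checks for missing keys. The values are placeholders, and load_bucket_config, REQUIRED_KEYS and EXAMPLE_CONFIG are hypothetical names introduced here for illustration.

```python
import json

# Keys that the script reads from the selected entry of qiniu.json.
REQUIRED_KEYS = ("access_key", "secret_key", "bucket_name", "public_domain")

# Placeholder layout; public_domain ends with "/" because the script
# builds the final URL as public_domain + key.
EXAMPLE_CONFIG = {
    "9ong": {
        "access_key": "<your access key>",
        "secret_key": "<your secret key>",
        "bucket_name": "<your bucket>",
        "public_domain": "http://image.xxx.com/",
    }
}


def load_bucket_config(path, profile):
    """Load one named entry from the JSON config and check its keys."""
    with open(path, encoding="utf-8") as f:
        config = json.load(f)
    section = config[profile]
    missing = [key for key in REQUIRED_KEYS if key not in section]
    if missing:
        raise KeyError(f"{path}: entry {profile!r} lacks {', '.join(missing)}")
    return section
```

Failing early on a missing key gives a clearer error than a KeyError raised in the middle of processing a file.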

import re
import time
import random
import sys
import json
from qiniu import Auth
from qiniu import BucketManager

# Fetch the remote images referenced in a Markdown file and store them in Qiniu
class MdImgFetch:
    
    __qiniuConfigFile = "qiniu.json" # Qiniu settings: access key, secret key, domain, bucket, etc.
    __qiniuConfig = {}
    __theFile = ""
    __bucket = ''    
    
    def __init__(self,theFile):
        if theFile:
            self.__theFile = theFile
        else:
            print("Please specify the path of the file to process")
            sys.exit()

    def fetch(self):
        self.__qiniuConfig = qiniuConfig =  self.__getQiniuConfig('9ong')
        qiniu = Auth(qiniuConfig["access_key"], qiniuConfig["secret_key"])
        self.__bucket = BucketManager(qiniu)            
        
        theList = []
        theContent = ''
        with open(self.__theFile,'r',encoding='utf-8') as rf:
            for line in rf:
                # Non-greedy groups so several images on one line are matched separately
                matchList = re.findall(r'!\[.*?\]\((.+?)\)',line)
                if matchList:
                    for originUrl in matchList:
                        print(originUrl)
                        # Ask Qiniu to fetch the remote image
                        newUrl = self.__fetchImg(originUrl)
                        print(newUrl)
                        # Replace the URL in this line
                        line = line.replace(originUrl,newUrl)

                theList.append(line)
        with open(self.__theFile,'w',encoding='utf-8') as wf:
            for oneLine in theList:
                theContent += oneLine
            if theContent:
                wf.write(theContent)
                print("\n"+self.__theFile+"\n===========================\n")
                # print(theContent)

    def __getQiniuConfig(self,bucketName):
        with open(self.__qiniuConfigFile) as f:
            config = json.load(f)
            if bucketName:
                return config[bucketName]
            else:
                return config        


    def __fetchImg(self,url):       

        key = self.__getKey()
        url = self.__replaceWebp(url)    

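        ret, info = self.__bucket.fetch(url, self.__qiniuConfig["bucket_name"], key)
        # print(info)
        # ret can be None when the fetch fails, so check it before indexing
        if ret is not None and ret.get('key')==key:
            return self.__qiniuConfig['public_domain'] + key
        else:
            # Fall back to the source URL so the link keeps working
            return url
        # assert ret['key'] == key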

    def __getKey(self):
        return "images/page/md-" + str(time.time()) + "-" + str(random.randint(1,1000)) + ".jpg"
        
    def __replaceWebp(self,src):
        a = ".qpic.cn"    
        if a in src:
            src =src.replace("tp=webp", "tp=jpg",1)

        return src     


if __name__ == '__main__':
    theFile = input("Enter the absolute path of the article: ")
    # theFile = r'I:\src\xxx\9ong\其他\xxx.md'
    mif = MdImgFetch(theFile)
    mif.fetch()
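
The backup step is still marked @todo in the step list, and the script overwrites the file in place. A minimal sketch of one way to fill that gap is to copy the file before it is rewritten. The backup_file function and the timestamped .bak naming are assumptions, not part of the original script.

```python
import shutil
import time
from pathlib import Path


def backup_file(path):
    """Copy path to a timestamped .bak file next to it and return the copy's path."""
    source = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = source.with_name(f"{source.name}.{stamp}.bak")
    # copy2 also preserves the modification time of the original.
    shutil.copy2(source, target)
    return target
```

Calling it at the top of fetch(), before the file is opened for writing, would keep the original content recoverable if a replacement goes wrong.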


Log output

For each image, the original URL is printed first, followed by the new URL that replaces it:

PS 9ong> & python qiniu-img-class.py
Enter the absolute path of the article: I:\src\hugo\9ong\产品设计\分销体系设计.md
http://image.woshipm.com/wp-files/2018/10/z74RWm4XZ67KnRuZyCel.png
http://image.xxx.com/images/page/md-1590570848.7889807-373.jpg
http://image.woshipm.com/wp-files/2018/10/lH2aSt3cMvDQXFiSAgG8.png
http://image.xxx.com/images/page/md-1590570850.1272373-899.jpg
http://image.woshipm.com/wp-files/2018/10/2s5wrGOCZTXLTzSSVYNq.png
http://image.xxx.com/images/page/md-1590570850.504021-523.jpg
http://image.woshipm.com/wp-files/2018/10/bXO1QRwZkuHfq3AVKrXC.png
http://image.xxx.com/images/page/md-1590570850.622559-707.jpg
http://image.woshipm.com/wp-files/2018/10/z82gJyh28yfTy3GbMn89.jpg
http://image.xxx.com/images/page/md-1590570850.8648922-346.jpg
http://image.woshipm.com/wp-files/2018/10/oczEHtR1KBX44eD6Ch91.png
http://image.xxx.com/images/page/md-1590570851.0835824-794.jpg
http://image.woshipm.com/wp-files/2018/10/tlt9CFogZoMLIUkYJkmp.png
http://image.xxx.com/images/page/md-1590570851.142831-45.jpg
http://image.woshipm.com/wp-files/2018/10/FSAv98FLdEVs9LFySDzQ.png
http://image.xxx.com/images/page/md-1590570851.3865244-978.jpg
http://image.woshipm.com/wp-files/2018/10/ikjFO3jMyP2IGgcEERmB.png
http://image.xxx.com/images/page/md-1590570851.6726031-280.jpg
http://image.woshipm.com/wp-files/2018/10/7lXFeBCH87NKZmnbskTU.png
http://image.xxx.com/images/page/md-1590570851.9097345-35.jpg
http://image.woshipm.com/wp-files/2018/10/ymprUlq0GG4arToxrqWB.png
http://image.xxx.com/images/page/md-1590570852.107605-889.jpg
http://image.woshipm.com/wp-files/2018/10/xKniC0Sybv44MwdIXvuG.png
http://image.xxx.com/images/page/md-1590570852.325012-272.jpg
http://image.woshipm.com/wp-files/2018/10/HESVAPuCq1ez4BRIg02j.png
http://image.xxx.com/images/page/md-1590570852.5032153-4.jpg
http://image.woshipm.com/wp-files/2018/10/wuUJxewCKVr67FyrF8mh.png
http://image.xxx.com/images/page/md-1590570852.8807476-115.jpg
http://image.woshipm.com/wp-files/2018/10/q3LKfWhN7FnGb6EqfP1J.png
http://image.xxx.com/images/page/md-1590570853.2011533-246.jpg
http://image.woshipm.com/wp-files/2018/10/0aGFqgN9hdvzzsO3lQmg.png
http://image.xxx.com/images/page/md-1590570853.3879051-240.jpg
http://image.woshipm.com/wp-files/2018/10/XG1xJnyDcs1myYDzq0Tw.png
http://image.xxx.com/images/page/md-1590570853.5631113-689.jpg
http://image.woshipm.com/wp-files/2018/10/1nmcaja5LrdtnkTBLuuR.png
http://image.xxx.com/images/page/md-1590570853.7774086-598.jpg
http://image.woshipm.com/wp-files/2018/10/NuVLA1PAttQSzmMY8FZX.png
http://image.xxx.com/images/page/md-1590570853.9684172-330.jpg
http://image.woshipm.com/wp-files/2018/10/5rnwBLW5AkyIK33oS2Rl.png
http://image.xxx.com/images/page/md-1590570854.1651154-755.jpg

I:\src\hugo\9ong\产品设计\分销体系设计.md
===========================
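
Since the log alternates between the original URL and its replacement, it can be turned back into a mapping, for example to audit or undo a run. This is a sketch under the assumption that every URL line starts with http:// or https:// and that URLs always come in pairs; pair_log_urls is a hypothetical helper name.

```python
def pair_log_urls(log_text):
    """Return (original_url, new_url) pairs from the alternating log lines."""
    urls = [
        line.strip()
        for line in log_text.splitlines()
        if line.strip().startswith(("http://", "https://"))
    ]
    # Even positions hold the original URLs, odd positions the replacements.
    return list(zip(urls[0::2], urls[1::2]))
```

When a fetch fails, the script prints the source URL as the second line of the pair, so such entries stand out as pairs whose second URL is not on the Qiniu domain.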