Image Processing (PIL & OpenCV)

Pillow and OpenCV represent images differently: Pillow uses its own Image object with channels in RGB order, while OpenCV uses NumPy arrays with channels in BGR order. So you can't read an image with Pillow and pass it straight to OpenCV; you need to convert from one format to the other.

# To convert from a PIL image to an OpenCV image use:
import cv2
import numpy as np
from PIL import Image

pil_image = Image.open("demo2.jpg")  # open image using PIL

# use numpy to convert the pil_image into a numpy array
numpy_image = np.array(pil_image)

# convert to an OpenCV image; notice the COLOR_RGB2BGR, which means that
# the color is converted from RGB to BGR format
opencv_image = cv2.cvtColor(numpy_image, cv2.COLOR_RGB2BGR)
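
The RGB-to-BGR step is only a reversal of the channel axis, which the following minimal sketch shows with NumPy alone. The helper name pil_to_bgr and the synthetic red image are illustrative assumptions, not part of either library; cv2.cvtColor remains the usual way to do this.

```python
import numpy as np
from PIL import Image


def pil_to_bgr(pil_image):
    """Return a uint8 array in BGR channel order (OpenCV's layout)."""
    # Force three channels so palette or RGBA images are handled too.
    rgb = np.array(pil_image.convert("RGB"))  # shape (height, width, 3), RGB
    # Reversing the last axis swaps R and B; copy to get contiguous memory.
    return np.ascontiguousarray(rgb[:, :, ::-1])


# Synthetic pure-red image (width 4, height 2) instead of a file on disk.
red = Image.new("RGB", (4, 2), (255, 0, 0))
bgr = pil_to_bgr(red)

# In BGR order the red value sits in the last channel.
print(bgr.shape, bgr[0, 0].tolist())
```

Note that PIL reports size as (width, height), while the array shape is (height, width, channels).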

# To convert from an OpenCV image to a PIL image use:
import cv2
import numpy as np
from PIL import Image

opencv_image = cv2.imread("demo2.jpg")  # open image using OpenCV

# convert from OpenCV to PIL; notice the COLOR_BGR2RGB, which means that
# the color is converted from BGR to RGB
pil_image = Image.fromarray(cv2.cvtColor(opencv_image, cv2.COLOR_BGR2RGB))

Convert between PIL image and NumPy ndarray

import numpy
from PIL import Image

image = Image.open("ponzo.jpg")  # image is a PIL image
array = numpy.array(image)       # array is a numpy array
image2 = Image.fromarray(array)  # image2 is a PIL image
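
Two details of this round trip are easy to trip over: the array shape is (height, width, channels) while PIL's size is (width, height), and Image.fromarray expects uint8 data for an RGB image. A small sketch, assuming a synthetic image in place of "ponzo.jpg" and an illustrative float scaling step:

```python
import numpy as np
from PIL import Image

# Synthetic RGB image: width 6, height 4, one uniform color.
image = Image.new("RGB", (6, 4), (10, 20, 30))
array = np.array(image)

print(image.size)   # PIL order: (width, height)
print(array.shape)  # NumPy order: (height, width, channels)

# Typical processing step: work in float, scaled to the range [0, 1].
scaled = array.astype(np.float32) / 255.0

# Convert back to uint8 before handing the data to PIL again.
restored_array = (scaled * 255.0).round().astype(np.uint8)
restored = Image.fromarray(restored_array)
```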

Convert between OpenCV image and NumPy ndarray

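import cv2
import numpy
from PIL import Image
# The legacy cv module and Image.fromstring are gone from current OpenCV and Pillow; cv2 is used instead.
cimg = cv2.imread("ponzo.jpg", cv2.IMREAD_COLOR)  # cimg is an OpenCV image (already a numpy array, BGR order)
pimg = Image.fromarray(cv2.cvtColor(cimg, cv2.COLOR_BGR2RGB))  # pimg is a PIL image
array = numpy.array(pimg)  # array is a numpy array (RGB order)
cimg2 = cv2.cvtColor(array, cv2.COLOR_RGB2BGR)  # cimg2 is an OpenCV image again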


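pandas provides I/O tools that can read a large file in chunks; the chunks are then joined into a single DataFrame with pandas.concat. The speedup is most noticeable with chunkSize set to around 10 million rows.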

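import pandas as pd

# "data.csv" is a placeholder path; iterator=True returns a reader
# that hands out rows on demand through get_chunk().
reader = pd.read_csv("data.csv", iterator=True)
loop = True
chunkSize = 100000
chunks = []
while loop:
    try:
        chunk = reader.get_chunk(chunkSize)
        chunks.append(chunk)  # keep each chunk for the final concat
    except StopIteration:
        # raised once the file has been read completely
        loop = False
        print("Iteration is stopped.")
df = pd.concat(chunks, ignore_index=True)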
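
The same chunked read can also be written as a plain for loop by passing chunksize to read_csv, which then returns an iterator of DataFrames. A minimal sketch, assuming a small in-memory CSV (the column names id and value are made up) and a tiny chunk size so the splitting is visible:

```python
import io

import pandas as pd

# Ten data rows of in-memory CSV stand in for a large file on disk.
csv_text = "id,value\n" + "\n".join(f"{i},{i * 2}" for i in range(10))

chunks = []
# With chunksize set, read_csv yields DataFrames of at most that many rows.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=4):
    chunks.append(chunk)

# ignore_index=True rebuilds a continuous 0..n-1 index across the chunks.
df = pd.concat(chunks, ignore_index=True)
print(len(chunks), df.shape)
```

Appending to a list and calling concat once at the end avoids repeatedly copying a growing DataFrame inside the loop.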