My technical memo: Bash


Tuesday, December 6, 2011

Convert Inkscape SVG images to EPS, preserving LaTeX formulas

1. Save the Inkscape SVG image as PDF+LaTeX.
2. Convert it to EPS using the following Bash script, which (1) runs pdflatex to generate a PDF, (2) runs pdfcrop to crop the PDF, and (3) runs pdftops to convert the cropped PDF to EPS.
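The three steps can be sketched as follows. This is only a minimal sketch under assumptions, not the original script: it assumes Inkscape wrote `NAME.pdf` and `NAME.pdf_tex`, that `header.tex` and `tail.tex` open and close a LaTeX document that loads the packages the `.pdf_tex` file needs, and that `pdflatex`, `pdfcrop` and `pdftops` are on the PATH. The function names and the `_wrap`/`_crop` file names are made up for illustration.

```shell
#!/bin/bash
# Sketch only: function names and intermediate file names are hypothetical.

# Print a LaTeX document that wraps NAME.pdf_tex between header.tex and tail.tex.
make_wrapper() {
    cat header.tex
    printf '\\input{%s.pdf_tex}\n' "$1"
    cat tail.tex
}

# Run the three steps for one figure, given its name without extension.
svg_pdf_to_eps() {
    local name=$1
    make_wrapper "$name" > "${name}_wrap.tex"
    pdflatex -interaction=nonstopmode "${name}_wrap.tex"   # (1) generate the PDF
    pdfcrop "${name}_wrap.pdf" "${name}_crop.pdf"          # (2) crop the PDF
    pdftops -eps "${name}_crop.pdf" "${name}.eps"          # (3) convert to EPS
}

# Usage: ./svg2eps.sh fig1.pdf_tex fig2.pdf_tex ...
for f in "$@"; do
    svg_pdf_to_eps "${f%.pdf_tex}"
done
```

Writing the wrapper to a separate `_wrap` file keeps pdflatex from overwriting the `NAME.pdf` that Inkscape produced.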

This is the header.tex:

This is the tail.tex:

The tool pdfcrop.pl can be downloaded from http://www.ctan.org/tex-archive/support/pdfcrop.

Wednesday, August 31, 2011

How to merge several pdf files for printing

I have 20 pdf files to print. Is there a way to print all of them at once and get the same printouts as if I had printed them separately? I could use pdftk to combine them into a single pdf file, but some files have an odd number of pages, so with double-sided printing the last page of one file and the first page of the next end up on the same sheet of paper.

The solution is to append an empty page to each file with an odd number of pages, and then use pdftk to combine the files. Here is the script (the file "emptypage.pdf" is just an empty page and can be created using LaTeX).
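A minimal sketch of the padding idea, not the original script. It assumes `pdftk` is installed, that `emptypage.pdf` sits in the current directory, and that the page count can be read from the `NumberOfPages:` line of `pdftk ... dump_data`. The function names and the output name `combined.pdf` are made up for illustration.

```shell
#!/bin/bash
# Sketch only: function names and combined.pdf are hypothetical.

# Print the number of pages of a PDF, taken from pdftk's dump_data report.
page_count() {
    pdftk "$1" dump_data | awk '/^NumberOfPages:/ {print $2}'
}

# Succeed if the given number is odd (an empty argument counts as 0).
is_odd() {
    (( ${1:-0} % 2 == 1 ))
}

# Pad every odd-length input with emptypage.pdf, then merge everything.
merge_for_printing() {
    local tmpdir f padded i=0
    local parts=()
    tmpdir=$(mktemp -d)
    for f in "$@"; do
        if is_odd "$(page_count "$f")"; then
            padded="$tmpdir/part$i.pdf"
            pdftk "$f" emptypage.pdf cat output "$padded"
            parts+=("$padded")
        else
            parts+=("$f")
        fi
        i=$((i + 1))
    done
    pdftk "${parts[@]}" cat output combined.pdf
    rm -rf "$tmpdir"
}

# Usage: ./mergeprint.sh a.pdf b.pdf ...
if (( $# > 0 )); then
    merge_for_printing "$@"
fi
```

Padded copies go into a temporary directory, so the input files are left untouched.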

Monday, February 21, 2011

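How to turn novels from the Open Literature site (开放文学网) into CHM e-books

Open Literature (开放文学网) is an excellent Taiwanese website that hosts many classical Chinese literary works, most of them proofread and of high quality. I recently bought the CHMate app for the iPhone, which makes reading CHM e-books very convenient, and I wanted to use it to read the e-books from Open Literature. However, those e-books are distributed as HTML files packed in zip archives, so how can they be turned into CHM e-books? After some experimenting, I found that the following steps work.

After unpacking the zip file, process the files with the following script.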

#!/bin/bash

# convert file encoding from big5 to utf8
for file in *.htm; do
    # only replace the original if the conversion succeeded
    iconv -f BIG5 -t UTF-8 "$file" -o tmp.htm && mv -f tmp.htm "$file"
done

# fix the titles
for file in *.htm; do
    num=${file%.*}
    # test whether num is a number
    if [ "$num" -eq "$num" ] 2>/dev/null; then
        # the title parts come from the first and last lines of index.htm that mention this file
        grep "$file" index.htm > tmp.htm
        title1=$(head -n 1 tmp.htm | sed 's/[a-zA-Z0-9 "\<\>=\&\/;.\%\^]//g')
        title2=$(tail -n 1 tmp.htm | sed 's/[a-zA-Z0-9 "\<\>=\&\/;.\%\^]//g')
        title="${title1}--${title2}"
        perl -pe "s/(\<title\>)([^[:ascii:]]*--[^[:ascii:]]*--[^[:ascii:]]*)(\<\/title\>)/\1${title}\3/g;" "$file" > tmp.htm
        mv -f tmp.htm "$file"
    fi
done

# pad the file names
for file in *.htm; do
    num=${file%.*}
    # test whether num is a number
    if [ "$num" -eq "$num" ] 2>/dev/null; then
        # 10# forces base 10 so that a name such as 08.htm is not parsed as octal
        nfile=$(printf '%03d.htm' "$((10#$num))")
        if [ "$file" != "$nfile" ]; then
            mv -f "$file" "$nfile"
        fi
    fi
done

# fix the links inside htm
# two-digit names get one leading zero, one-digit names get two
sed -i 's/\([^0-9]\)\([0-9][0-9]\.htm\)/\10\2/g' *.htm
sed -i 's/\([^0-9]\)\([0-9]\.htm\)/\100\2/g' *.htm

# convert from utf8 to gbk
for file in *.htm; do
    iconv -f UTF-8 -t GBK "$file" -o tmp.htm && mv -f tmp.htm "$file"
done

sed -i 's/big5/gbk/g' *.htm

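“CHM制作精灵" is a free CHM authoring program. In this step, use it to generate the chm file: select the directory containing the htm files, then choose Compile. Remember to change the book title before compiling. After compiling, the file is saved in the same directory as the htm files, under the name "CHM 帮助.chm".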

Sunday, December 5, 2010

How to use sed to extract lines from a text file

In a text file, every 1090 lines correspond to the solution at one time step. I want to extract one solution every 100 time steps. How can this be done?

This can be done easily using the Linux tool "sed".

First, we extract the solutions and save them into a single file:

sed -n 1~109000,+1089p test.txt > out.txt

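Here "1" is the starting line of the text file, "~109000" means every 109000 lines (1090 lines per time step × 100 time steps), and "+1089" means print that line and the following 1089 lines. The "first~step" and "addr,+N" address forms are GNU sed extensions.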
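The address can be checked on a small input before running it on the real file. The sketch below assumes GNU sed and scales the numbers down to 3 lines per block, keeping every 2nd block. The helper name `extract_blocks` is made up for illustration.

```shell
#!/bin/bash
# extract_blocks LINES_PER_BLOCK EVERY [FILE...]
# Print one block of LINES_PER_BLOCK lines out of every EVERY blocks,
# starting with the first block. Reads stdin when no file is given.
extract_blocks() {
    local block=$1 every=$2
    shift 2
    local step=$(( block * every ))   # 1090 * 100 = 109000 in the post
    local extra=$(( block - 1 ))      # 1089 in the post
    sed -n "1~${step},+${extra}p" "$@"
}

# Toy input: lines 1..12, 3 lines per block, keep every 2nd block.
seq 1 12 | extract_blocks 3 2
# prints 1 2 3 7 8 9, one number per line
```

With the numbers from the post, `extract_blocks 1090 100 test.txt > out.txt` runs the same sed command as above.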

Next, we can split the solutions in out.txt into separate text files using the Linux tool "split".

split -l 1090 -a 6 -d out.txt new

Here "-l 1090" means each new file gets 1090 lines, "-a 6" means the numbering in the file names uses 6 digits, "-d" means the numbering uses digits instead of letters, and the final "new" means the file names start with "new".
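The effect of these options is easy to see on a toy file. The sketch below assumes GNU coreutils split (the "-d" option is a GNU extension) and uses made-up small numbers: 6 lines, 2 lines per file, 3-digit suffixes. It works in a temporary directory so no existing files are touched.

```shell
#!/bin/bash
# Toy version of the split step, run in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

seq 1 6 > out.txt                 # stand-in for the real out.txt
split -l 2 -a 3 -d out.txt new    # 2 lines per file, 3-digit numeric suffix

ls new*
# lists new000, new001 and new002
```

Numeric suffixes start at 0, so the first extracted solution ends up in the file numbered 0.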

For more options, you can use "man sed" and "man split" to look up the manual.

How to place graphs from R and ParaView side by side

First, create the graph in R:

pdf("R0001.pdf",width=7,height=8.48)

plot(test[1:1432,2],test[1:1432,3]/pi,type="l",xlim=c(-15,30),ylim=c(0,1),xlab="W0",ylab="theta",col="blue")

points(test[1000,2],test[1000,3]/pi,pch=20)

dev.off()

Second, create the graph in ParaView using a left-right split, and save it as "P0001.pdf".

Third, combine the two graphs:

pdftk R0001.pdf P0001.pdf cat output C0001.pdf

java -cp ~/local/Multivalent20060102.jar tool.pdf.Impose -dim 1x2 -verbose -layout "1,2" C0001.pdf

pdfcrop C0001-up.pdf

You will get C0001-up-crop.pdf as the final result.

Saturday, December 4, 2010

How to combine several pdf files into one single pdf file

In Linux, we can use the following command to merge several pdf files into a single pdf file:

gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=combinedpdf.pdf -dBATCH *.pdf

Another way is to use the handy tool called "pdftk":

pdftk *.pdf cat output combinedpdf.pdf
