Archive for the ‘programming’ Category

Find files before/after/in between specific dates

February 14, 2011

Hello folks. \\//,

Well here I am, reading a technical research paper on membrane computing and Petri nets when I decided to download some of the reference materials, conference/journal technical papers, of the paper I’m currently reading.  So I download the reference materials from IEEE Xplore, plus other non-reference materials but still related to the topic I’m reading.

Eventually, I download around 10 PDF files, with filenames made up purely of digits and of course the .pdf extension. Problem is, I’m too lazy (not that much really, I just don’t want to keep doing this manually over and over) to specify each PDF filename, add them to a compressed archive (in this case a RAR file), and then send them to another machine or person or account of mine for archiving. Essentially, I just want this: to get the PDF files I recently downloaded for the specific topic I’m currently on, leave out the other PDF files in the same directory, and compress them into a single RAR file.

How to go about doing this?

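Luckily there’s the nifty find command, which can do its own looping, in *nixes like my Ubuntu box. 🙂 So what I simply do is

find . -type f -cnewer startdate ! -cnewer enddate -exec rar a myrarfile.rar {} +

which means given 2 reference points, the files startdate and enddate, find picks every file whose status change time is later than that of startdate but not later than that of enddate, and hands them all to rar to be added to the RAR file myrarfile.rar. Note that -cnewer compares status change times (ctime); use -newer instead if you want to compare modification times. I let find run rar through -exec rather than looping over backticked find output with a for loop, since that kind of loop breaks on filenames containing spaces.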
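If you don’t already have two convenient reference files lying around, you can make them. Below is a minimal sketch that assumes you would rather compare modification times (-newer) and that your touch accepts -t timestamps; the dates, the temporary directory, and the sample filenames are all made up for illustration.

```shell
#!/bin/sh
# Sketch: select files modified inside a time window bounded by two marker files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Marker files whose modification times bound the window (hypothetical dates).
touch -t 201102140900 startdate
touch -t 201102141700 enddate

# Sample files: one before, one inside, and one after the window.
touch -t 201102130000 old.pdf
touch -t 201102141200 "inside window.pdf"
touch -t 201102150000 new.pdf

# -newer compares modification times; -name keeps the marker files
# themselves (and anything that is not a PDF) out of the result.
matches=$(find . -type f -name '*.pdf' -newer startdate ! -newer enddate)
echo "$matches"
```

Only the file touched inside the window is listed, spaces in its name and all. To archive the matches instead of printing them, the same find expression can end in -exec rar a myrarfile.rar {} + as above.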

Presto. Problem solved, learned something new. Now, back to reading research papers. 🙂

\\//,

Reference/s:

http://unixhelp.ed.ac.uk/CGI/man-cgi?find

SVN over HTTP

January 31, 2011

Hello there true believers, been a while since I’ve written something here.


A quick FYI on those wanting to use SVN but are restricted to use it from within a network with a proxy.
If you try using SVN over HTTP from behind such a proxy, it most likely won’t work by default, since the svn client doesn’t go through the proxy until you tell it to. (Repos served over the svn:// protocol, which usually listens on port 3690, are a different story: HTTP proxies normally only handle web traffic, typically on ports like 80 or 8080.)
If you try checking out or updating or committing to/from an SVN repo by default, you’ll run into a problem saying SVN couldn’t connect. Uh oh spaghetti-oh. 🙂
On my Ubuntu 10.04 machine, what I do is open (with sudo/root privileges) the file:

/etc/subversion/servers

and uncomment (or add, if the following are not present) these parameters:

http-proxy-host = yourproxy.com
http-proxy-port = port

e.g.

http-proxy-host = 10.10.10.10
http-proxy-port = 8080

I save the file, then try SVN again. Voila. I’m back to coding again. 🙂
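For reference, here is roughly what the relevant part of the servers file looks like once edited. This is a sketch that assumes the settings go under the file’s [global] section, which applies to every server; the host and port are just the example values from above. A per-user copy of the same file lives at ~/.subversion/servers, in case you’d rather not touch the system-wide one.

```
[global]
http-proxy-host = 10.10.10.10
http-proxy-port = 8080
```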


Fixing the PHP error/warning “Cannot modify header information – headers already sent”

August 18, 2009

This will be a quick post/note-to-self since I’m pretty occupied. Actually, the title of this post should have been “How to bloody fix the deceptively easy but hard to find confounding error in PHP headers: ‘warning Cannot modify header information – headers already sent’”. But that’s too bloody long (though it would be interesting to find out in the future how WordPress concatenates long URLs…). The reason why I call it deceptive will be clarified in the 3rd entry below 🙂

The 2 primary parts of an HTTP response are the headers and the body. The headers must always be sent first, followed by the body, and both come from the web server. This is highlighted in this Wikipedia HTTP request example. Now in PHP, sometimes some programmers, not just novice ones but long-time ones (ehem…like me…), forget that we’re modifying them programmatically, which can cause errors. For example, the PHP function header() can modify some of the (obviously) header parameters, most/all of which are listed in this Wikipedia list of HTTP headers. The above PHP error occurs because the body (or part of it) has already been sent by the server to the client, after which the script tries to change a header value. By then the headers are already on their way, so it’s too late to modify them.
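To see why a late header can’t work, it helps to look at a raw response. The sketch below builds one by hand in plain shell (no PHP or web server involved, and the paths and markup are made up) and splits it at the first blank line, which is where the headers end and the body begins.

```shell
#!/bin/sh
# A hand-made HTTP-style response: header lines, one blank line, then the body.
# The final "Location:" line arrives after the body has already started.
response=$(printf '%s\r\n' \
  'HTTP/1.1 200 OK' \
  'Content-Type: text/html' \
  '' \
  '<p>Hello</p>' \
  'Location: /too-late.php')

# HTTP ends lines with CRLF; drop the CRs to make the text easy to split.
clean=$(printf '%s\n' "$response" | tr -d '\r')

# Everything before the first blank line is headers, everything after is body.
headers=$(printf '%s\n' "$clean" | sed '/^$/,$d')
body=$(printf '%s\n' "$clean" | sed '1,/^$/d')

echo "--- headers ---"; echo "$headers"
echo "--- body ---";    echo "$body"
```

The Location line ends up in the body, where a browser would treat it as page content and not as a redirect. PHP refuses to let that happen silently, hence the warning.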

Now, to finally fix the deceptively easy to fix but hard to find source of the error

Warning: Cannot modify header information – headers already sent by (<some of your PHP source files should be listed here>)

You can check the following:

1) If you’re using the header() function or some PHP function that modifies the header or controls the flow of action of your pages (e.g. from one page to another), you should inspect those. Usually it’s better to use conditional statements (e.g. IF or ELSE) to isolate the execution of one part of your code from another. It is quite likely that the error comes before or at the line of this function.

2) Make sure you don’t output/print/echo anything to the client (body) before sending/changing the headers. Again, conditional statements are useful here.

3) Finally, and the easiest to overlook, is to remove any white space outside the PHP start and end tags (<?php ?>). This is quite often the easiest thing to miss (for me at the least). The reason this deceptive white space causes an error is that anything outside the PHP tags, white space included, is sent to the client as part of the body, just as if you had echoed it. That forces the headers out first, so any later header change fails (see Wikipedia header format example above).
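Hunting for that stray white space by eye gets old fast, so here is a rough shell sketch for check 3. It assumes every file is supposed to begin with <?php (templates that start with HTML on purpose will be flagged too) and that your head supports -c; the sample files are created in a temporary directory purely for illustration.

```shell
#!/bin/sh
# Sketch: flag PHP files whose very first bytes are not the opening tag.
dir=$(mktemp -d)
printf '<?php\nheader("Location: /home.php");\n' > "$dir/good.php"
# Same script, but with a blank line sneaking in before the opening tag.
printf '\n<?php\nheader("Location: /home.php");\n' > "$dir/bad.php"

suspects=""
for f in "$dir"/*.php; do
  # Read the first five bytes; anything other than "<?php" means
  # something is output before PHP starts running.
  first=$(head -c 5 "$f")
  if [ "$first" != "<?php" ]; then
    suspects="$suspects$(basename "$f") "
  fi
done
echo "files with output before <?php: $suspects"
```

This only catches leading output. Trailing white space after a closing ?> needs a separate look, or can be avoided by leaving the closing tag off in files that contain only PHP.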

Of course the disclaimer here is that you are most likely to encounter this error if you’re more or less building your PHP application from the ground up, or without using web frameworks. It’s more unlikely and unusual to receive this error while you’re using an MVC based PHP framework.


Sci-fi ponderings when (almost) idle

June 22, 2009

I was talking to one of my housemates last night while I was taking a break from using my PC when we came to the topic of Terminators.  Being sci-fi fans (me especially) we talked about the realism of robots and computer software taking over the world.

The proposed problems pertaining to plausibilities

Now, my housemate proposed that Skynet, the autonomous computer program that took over the world and is responsible for the decline and domination of the human race, isn’t very plausible, or at least isn’t ‘too’ smart. His propositions are the following:

a) Skynet should have made the Terminators smarter so as to make them more adaptable to human circumstances, issues, and environment.

He mentions that though they look like humans, they (or at least the ones in the movies, unlike in the Terminator TV series) still act relatively cold and robot-like. Adapting to human behavior, emotions, idiosyncrasies, at least temporarily, may help them perform their missions better, i.e. terminating their targets.

b) Skynet, at least in the future (so my housemate concedes), should have made itself connected/linked to Terminators so that it can use its powerful processors and information on humans, including their tactics, to finally wipe out the human race and leave nothing to stand up against it.

I made my housemate actually concede early in this proposition that this can only work in the future, because how would Skynet of the future control and communicate with Terminators it has sent to the past? It could be quite given that future Skynet would be linked to the Terminator pawns via some wireless technology, but wireless technology across time? Dubious.

The proposed answers to the problems

a) Now this one has been answered already in Terminator 2: Judgment Day, when John Connor asks (approximately at 1 hour 6 minutes of the Special Edition of the movie) if Terminators (or at least the T-800 model 101 Terminator a.k.a. Arnold model/line of terminators) can learn new things so they can be more human. The terminator responds by saying that Skynet “presets the switch to ‘read-only’ when terminators are sent out alone”, to prevent them from “thinking too much”. This then prevents terminators by default, or at least the movie terminators such as the T-800, T-850, T-1000, and T-X, from learning a lot of things about what makes humans humans.

b) One solution I’ve thought about for this specific conundrum in the Terminator universe, which could also apply to real life hardware/software, is that if Skynet ‘hooks’ itself up to every terminator walking around trying to find, infiltrate, and terminate humans, i.e. connects its thinking to the terminators, then that would lead to a vulnerability. Hooking up to terminators in the field would allow humans to insert a virus or ‘anti-Skynet’ software into one or more captured terminators, which could then be uploaded to the main Skynet program and destroy Skynet entirely. This is possible because Skynet has to maintain a duplex connection to the terminators in the field if it is to control them and keep them in sync with the main Skynet program. I think this is a risk Skynet would not dare take.

Questions/comments/arguments? Feel free to post them as long as they’re calm and ruly. 🙂

Pretty Practicable PDF Tricks In Linux

March 23, 2009

I still don’t have quite a lot of time to write a more or less decent technology or philosophy or science/math related post, but I just want to put this on my blog for the sake of reference  again (as most, if not all, of my blog entries).

My Dilemma

I have a copy of a pdf file from which I want to share only some parts with my lab exercise partner (for reasons I can’t exactly divulge on the public Internet). So I Google around for how to manipulate a pdf file, specifically how to pluck/extract specific pages from it and still output the extracted pages as pdf file/s themselves. Then I found pdftk. Fantastic tool. Really.

Why Is It Fantastic?

Here are a few reasons why:

For such a small (more or less) package (3408kB in my Ubuntu 8.10 installation), it can do a lot. Quoting linux.com:

Developer Sid Steward describes pdftk as the PDF equivalent of an “electronic staple remover, hole punch, binder, secret decoder ring, and X-ray glasses.” Pdftk can join and split PDFs; pull single pages from a file; encrypt and decrypt PDF files; add, update, and export a PDF’s metadata; export bookmarks to a text file; add or remove attachments to a PDF; fix a damaged PDF; and fill out PDF forms. In short, there’s very little pdftk can’t do when it comes to working with PDFs.

Swiss army knife of PDF files anyone? And thankfully, it’s free and open source. The above quote is from linux.com, and a lot of us know that once something gets posted on linux.com, it’s more or less worthwhile to learn, or to read about at the very least. pdftk is a command line tool (sorry, GUI fans, but check out the GUI front end in my further references below).

And installing it is just simply

sudo apt-get install pdftk

in my Ubuntu 8.04 and 8.10 installations. Again, quoting from linux.com, here are some very useful (at least to me) things you can do with pdftk. Of course, with a bit of knowledge in scripting or programming (bash, php, python etc) you can work wonders with this tool:

Joining files

Pdftk’s ability to join two or more PDF files is on par with such specialized applications as pdfmeld and joinPDF (discussed in this article). The command syntax is simple:

pdftk file1.pdf file2.pdf cat output newFile.pdf

cat is short for concatenate — that is, link together, for those of us who speak plain English — and output tells pdftk to write the combined PDFs to a new file.

Pdftk doesn’t retain bookmarks, but it does keep hyperlinks to both destinations within the PDF and to external files or Web sites. Where some other applications point to the wrong destinations for hyperlinks, the links in PDFs combined using pdftk managed to hit each link target perfectly.

Splitting files

Splitting PDF files with pdftk was an interesting experience. The burst option breaks a PDF into multiple files — one file for each page:

pdftk user_guide.pdf burst

I don’t see the use of doing that, and with larger documents you wind up with a lot of files with names corresponding to their page numbers, like pg_0001 and pg_0013 — not very intuitive.

On the other hand, I found pdftk’s ability to remove specific pages from a PDF file to be useful. For example, to remove pages 10 to 25 from a PDF file, you’d type the following command:

pdftk myDocument.pdf cat 1-9 26-end output removedPages.pdf
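Working out the “everything except these pages” ranges by hand is easy to get off by one, so here is a small sketch of a helper that does the arithmetic. keep_ranges is a hypothetical function of my own, not part of pdftk, and it assumes the pages being removed do not include the final page of the document.

```shell
#!/bin/sh
# keep_ranges FIRST LAST
# Prints the pdftk page ranges that keep everything except pages FIRST..LAST.
# Hypothetical helper; assumes LAST is not the final page of the document.
keep_ranges() {
  first=$1
  last=$2
  # Pages before the removed block, if there are any.
  if [ "$first" -gt 1 ]; then
    printf '1-%d ' "$((first - 1))"
  fi
  # Pages after the removed block, up to the end of the document.
  printf '%d-end\n' "$((last + 1))"
}

keep_ranges 10 25   # prints: 1-9 26-end
keep_ranges 13 13   # prints: 1-12 14-end
keep_ranges 1 3     # prints: 4-end

# Usage with pdftk (leave the substitution unquoted so the ranges
# are passed as separate arguments):
# pdftk myDocument.pdf cat $(keep_ranges 10 25) output removedPages.pdf
```

The first call reproduces the ranges from the example above.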

Updated Man page

For all the geeks and geekettes out there (no this sub heading is not sexist), here’s an updated man page from my Ubuntu 8.10 server installation:

PDFTK(1)                                                                                                                          PDFTK(1)

NAME
pdftk - A handy tool for manipulating PDF

SYNOPSIS
pdftk <input PDF files | - | PROMPT>
[input_pw <input PDF owner passwords | PROMPT>]
[<operation> <operation arguments>]
[output <output filename | - | PROMPT>]
[encrypt_40bit | encrypt_128bit]
[allow <permissions>]
[owner_pw <owner password | PROMPT>]
[user_pw <user password | PROMPT>]
[flatten] [compress | uncompress]
[keep_first_id | keep_final_id] [drop_xfa]
[verbose] [dont_ask | do_ask]
Where:
<operation> may be empty, or:
[cat | attach_files | unpack_files | burst |
fill_form | background | stamp | generate_fdf
dump_data | dump_data_fields | update_info]

For Complete Help: pdftk --help

DESCRIPTION
If PDF is electronic paper, then pdftk is an electronic staple-remover, hole-punch, binder, secret-decoder-ring, and X-Ray-glasses.
Pdftk is a simple tool for doing everyday things with PDF documents.  Use it to:

* Merge PDF Documents
* Split PDF Pages into a New Document
* Rotate PDF Documents or Pages
* Decrypt Input as Necessary (Password Required)
* Encrypt Output as Desired
* Fill PDF Forms with X/FDF Data and/or Flatten Forms
* Generate FDF Data Stencil from PDF Forms
* Apply a Background Watermark or a Foreground Stamp
* Report PDF Metrics such as Metadata and Bookmarks
* Update PDF Metadata
* Attach Files to PDF Pages or the PDF Document
* Unpack PDF Attachments
* Burst a PDF Document into Single Pages
* Uncompress and Re-Compress Page Streams
* Repair Corrupted PDF (Where Possible)

OPTIONS
A summary of options is included below.

--help, -h
Show summary of options.

<input PDF files | - | PROMPT>
A list of the input PDF files. If you plan to combine these PDFs (without using handles) then list files in the order you
want them combined. Use - to pass a single PDF into pdftk via stdin. Input files can be associated with handles, where a
handle is a single, upper-case letter:

<input PDF handle>=<input PDF filename>

Handles are often omitted.  They are useful when specifying PDF passwords or page ranges, later.

For example: A=input1.pdf B=input2.pdf

[input_pw <input PDF owner passwords | PROMPT>]
Input PDF owner passwords, if necessary, are associated with files by using their handles:

<input PDF handle>=<input PDF file owner password>

If handles are not given, then passwords are associated with input files by order.

Most pdftk features require that encrypted input PDF are accompanied by the ~owner~ password. If the input PDF has no  owner
password,  then  the  user  password  must be given, instead.  If the input PDF has no passwords, then no password should be
given.

When running in do_ask mode, pdftk will prompt you for a password if the supplied password is incorrect or none was given.

[<operation> <operation arguments>]
If this optional argument is omitted, then pdftk runs in ’filter’ mode.  Filter mode takes only one PDF input and creates  a
new PDF after applying all of the output options, like encryption and compression.

Available operations are: cat, attach_files, unpack_files, burst, fill_form, background, stamp, dump_data, dump_data_fields,
generate_fdf, update_info. Some operations takes additional arguments, described below.

cat [<page ranges>]
Catenates pages from input PDFs to create a new PDF.  Page order in the new PDF is specified by the order  of  the  given
page ranges.  Page ranges are described like this:

<input PDF handle>[<begin page number>[-<end page number>[<qualifier>]]][<page rotation>]

Where the handle identifies one of the input PDF files, and the beginning and ending page numbers are one-based references
to pages in the PDF file, and the qualifier can be even or odd, and the page rotation can be N, S, E, W, L, R, or D.

If the handle is omitted from the page range, then the pages are taken from the first input PDF.

The  even  qualifier  causes  pdftk  to  use only the even-numbered PDF pages, so 1-6even yields pages 2, 4 and 6 in that
order.  6-1even yields pages 6, 4 and 2 in that order.

The odd qualifier works similarly to the even.

The page rotation setting can cause pdftk to rotate pages and documents.  Each option sets the page rotation  as  follows
(in  degrees):  N:  0,  E: 90, S: 180, W: 270, L: -90, R: +90, D: +180. L, R, and D make relative adjustments to a page’s
rotation.

If no arguments are passed to cat, then pdftk combines all input PDFs in the order they were given to create the  output.

NOTES:
* <end page number> may be less than <begin page number>.
* The keyword end may be used to reference the final page of a document instead of a page number.
* Reference a single page by omitting the ending page number.
* The handle may be used alone to represent the entire PDF document, e.g., B1-end is the same as B.

Page Range Examples w/o Handles:
1-endE – rotate entire document 90 degrees
5 11 20
5-25oddW – take odd pages in range, rotate 90 degrees
6-1

Page Range Examples Using Handles:
Say A=in1.pdf B=in2.pdf, then:
A1-21
Bend-1odd
A72
A1-21 Beven A72
AW – rotate entire document 90 degrees
B
A2-30evenL – take the even pages from the range, remove 90 degrees from each page’s rotation
A A
AevenW AoddE
AW BW BD

attach_files <attachment filenames | PROMPT> [to_page <page number | PROMPT>]
Packs  arbitrary  files  into  a  PDF  using PDF’s file attachment features. More than one attachment may be listed after
attach_files. Attachments are added at the document level unless the optional to_page option is given, in which case  the
files are attached to the given page number (the first page is 1, the final page is end). For example:

pdftk in.pdf attach_files table1.html table2.html to_page 6 output out.pdf

unpack_files
Copies  all  of  the attachments from the input PDF into the current folder or to an output directory given after output.
For example:

pdftk report.pdf unpack_files output ~/atts/

or, interactively:

pdftk report.pdf unpack_files output PROMPT

burst  Splits a single, input PDF document into individual pages. Also creates a report named doc_data.txt which is the same as
the output from dump_data. If the output section is omitted, then PDF pages are named: pg_%04d.pdf, e.g.: pg_0001.pdf,
pg_0002.pdf, etc. To name these pages yourself, supply a printf-styled format string via the output section. For example,
if you want pages named: page_01.pdf, page_02.pdf, etc., pass output page_%02d.pdf to pdftk. Encryption can be
applied to the output by appending output options such as owner_pw, e.g.:

pdftk in.pdf burst owner_pw foopass

fill_form <FDF data filename | XFDF data filename | - | PROMPT>
Fills the single input PDF’s form fields with the data from an FDF file, XFDF file or stdin. Enter the data filename
after fill_form, or use - to pass the data via stdin, like so:

pdftk form.pdf fill_form data.fdf output form.filled.pdf

After  filling  a  form, the form fields remain interactive unless you also use the flatten output option. flatten merges
the form fields with the PDF pages. You can use flatten alone, too, but only on a single PDF:

pdftk form.pdf fill_form data.fdf output out.pdf flatten

or:

pdftk form.filled.pdf output out.pdf flatten

If the input FDF file includes Rich Text formatted data in addition to plain text, then the Rich Text data is packed into
the form fields as well as the plain text. Pdftk also sets a flag that cues Acrobat/Reader to generate new field appearances
based on the Rich Text data. That way, when the user opens the PDF, the viewer will create the Rich Text fields on
the spot. If the user’s PDF viewer does not support Rich Text, then the user will see the plain text data instead. If
you flatten this form before Acrobat has a chance to create (and save) new field appearances, then the plain text field
data is what you’ll see.

background <background PDF filename | - | PROMPT>
Applies  a  PDF  watermark  to the background of a single input PDF.  Pass the background PDF’s filename after background
like so:

pdftk in.pdf background back.pdf output out.pdf

Pdftk uses only the first page from the background PDF and applies it to every page of the input PDF. This page is
scaled and rotated as needed to fit the input page. You can use - to pass a background PDF into pdftk via stdin.

If the input PDF does not have a transparent background (such as a PDF created from page scans) then the resulting background
won’t be visible; use the stamp feature instead.

stamp <stamp PDF filename | - | PROMPT>
This behaves just like the background feature except it overlays the stamp PDF page on top of the  input  PDF  document’s
pages.  This works best if the stamp PDF page has a transparent background.

dump_data
Reads  a  single, input PDF file and reports various statistics, metadata, bookmarks (a/k/a outlines), and page labels to
the given output filename or (if no output is given) to stdout.  Does not create a new PDF.

dump_data_fields
Reads a single, input PDF file and reports form field statistics to the given output filename or (if no output is  given)
to stdout.  Does not create a new PDF.

generate_fdf
Reads  a single, input PDF file and generates a FDF file suitable for fill_form out of it to the given output filename or
(if no output is given) to stdout.  Does not create a new PDF.

update_info <info data filename | - | PROMPT>
Changes the metadata stored in a single PDF’s Info dictionary to match the input data file. The input data file uses  the
same  syntax  as  the  output from dump_data. This does not change the metadata stored in the PDF’s XMP stream, if it has
one. For example:

pdftk in.pdf update_info in.info output out.pdf

[output <output filename | - | PROMPT>]
The output PDF filename may not be set to the name of an input filename. Use - to output to stdout. When using the
dump_data  operation,  use output to set the name of the output data file. When using the unpack_files operation, use output
to set the name of an output directory.  When using the burst operation, you can use output to  control  the  resulting  PDF
page filenames (described above).

[encrypt_40bit | encrypt_128bit]
If an output PDF user or owner password is given, output PDF encryption strength defaults to 128 bits. This can be overridden
by specifying encrypt_40bit.

[allow <permissions>]
Permissions are applied to the output PDF only if an encryption strength is specified or an owner or user password is given.
If permissions are not specified, they default to ’none,’ which means all of the following features are disabled.

The permissions section may include one or more of the following features:

Printing
Top Quality Printing

DegradedPrinting
Lower Quality Printing

ModifyContents
Also allows Assembly

Assembly

CopyContents
Also allows ScreenReaders

ScreenReaders

ModifyAnnotations
Also allows FillIn

FillIn

AllFeatures
Allows the user to perform all of the above, and top quality printing.

[owner_pw <owner password | PROMPT>]

[user_pw <user password | PROMPT>]
If  an  encryption  strength  is  given but no passwords are supplied, then the owner and user passwords remain empty, which
means that the resulting PDF may be opened and its security parameters altered by anybody.

[compress | uncompress]
These are only useful when you want to edit PDF code in a text editor like vim or emacs.  Remove PDF page stream compression
by applying the uncompress filter. Use the compress filter to restore compression.

[flatten]
Use  this  option  to merge an input PDF’s interactive form fields (and their data) with the PDF’s pages. Only one input PDF
may be given. Sometimes used with the fill_form operation.

[keep_first_id | keep_final_id]
When combining pages from multiple PDFs, use one of these options to copy the document ID from either  the  first  or  final
input  document  into the new output PDF. Otherwise pdftk creates a new document ID for the output PDF. When no operation is
given, pdftk always uses the ID from the (single) input PDF.

[drop_xfa]
If your input PDF is a form created using Acrobat 7 or Adobe Designer, then it probably has XFA data.  Filling such  a  form
using  pdftk  yields  a PDF with data that fails to display in Acrobat 7 (and 6?).  The workaround solution is to remove the
form’s XFA data, either before you fill the form using pdftk or at the time you fill the  form.  Using  this  option  causes
pdftk to omit the XFA data from the output PDF form.

This  option  is  only  useful  when  running pdftk on a single input PDF.  When assembling a PDF from multiple inputs using
pdftk, any XFA data in the input is automatically omitted.

[verbose]
By default, pdftk runs quietly. Append verbose to the end and it will speak up.

[dont_ask | do_ask]
Depending on the compile-time settings (see ASK_ABOUT_WARNINGS), pdftk might prompt you for further input when it encounters
a  problem, such as a bad password. Override this default behavior by adding dont_ask (so pdftk won’t ask you what to do) or
do_ask (so pdftk will ask you what to do).

When running in dont_ask mode, pdftk will over-write files with its output without notice.

EXAMPLES
Decrypt a PDF
pdftk secured.pdf input_pw foopass output unsecured.pdf

Encrypt a PDF using 128-bit strength (the default), withhold all permissions (the default)
pdftk 1.pdf output 1.128.pdf owner_pw foopass

Same as above, except password ’baz’ must also be used to open output PDF
pdftk 1.pdf output 1.128.pdf owner_pw foo user_pw baz

Same as above, except printing is allowed (once the PDF is open)
pdftk 1.pdf output 1.128.pdf owner_pw foo user_pw baz allow printing

Join in1.pdf and in2.pdf into a new PDF, out1.pdf
pdftk in1.pdf in2.pdf cat output out1.pdf
or (using handles):
pdftk A=in1.pdf B=in2.pdf cat A B output out1.pdf
or (using wildcards):
pdftk *.pdf cat output combined.pdf

Remove ’page 13’ from in1.pdf to create out1.pdf
pdftk in1.pdf cat 1-12 14-end output out1.pdf
or:
pdftk A=in1.pdf cat A1-12 A14-end output out1.pdf

Apply 40-bit encryption to output, revoking all permissions (the default). Set the owner PW to ’foopass’.
pdftk 1.pdf 2.pdf cat output 3.pdf encrypt_40bit owner_pw foopass

Join two files, one of which requires the password ’foopass’. The output is not encrypted.
pdftk A=secured.pdf 2.pdf input_pw A=foopass cat output 3.pdf

Uncompress PDF page streams for editing the PDF in a text editor (e.g., vim, emacs)
pdftk doc.pdf output doc.unc.pdf uncompress

Repair a PDF’s corrupted XREF table and stream lengths, if possible
pdftk broken.pdf output fixed.pdf

Burst a single PDF document into pages and dump its data to doc_data.txt
pdftk in.pdf burst

Burst a single PDF document into encrypted pages. Allow low-quality printing
pdftk in.pdf burst owner_pw foopass allow DegradedPrinting

Write a report on PDF document metadata and bookmarks to report.txt
pdftk in.pdf dump_data output report.txt

Rotate the first PDF page to 90 degrees clockwise
pdftk in.pdf cat 1E 2-end output out.pdf

Rotate an entire PDF document to 180 degrees
pdftk in.pdf cat 1-endS output out.pdf

NOTES
pdftk uses a slightly modified iText Java library (http://itextpdf.sourceforge.net/) to read and write  PDF.  The  author  compiled
this Java library using GCJ (http://gcc.gnu.org) so it could be linked with a front end written in C++.

The pdftk home page is http://www.accesspdf.com/pdftk/.

AUTHOR
Sid Steward (ssteward@accesspdf.com) maintains pdftk.

September 18, 2006                                                    PDFTK(1)

Comments, questions, and suggestions are always welcome as long as they’re calm and ruly 🙂

Further References

>> linux.com reference article

>> Main site for pdftk (including manual/documentation), a bit dated though

>> GUI for pdftk

Shakespeare and programming

December 10, 2008

A great playwright and poet once wrote in his play Hamlet (and that great poet and playwright of course is none other than Shakespeare)  the following question from act three, scene one:

To be or not to be, that is the question;

Putting it into a more geeky format I have the following translation:

2B OR NOT 2B

Which sort of turns the question into a logical statement. Tidying it up a little further and noting the unary and binary operators in the statement, as well as the operator precedence, and further clarifying its (geeky) nature I have:

0x2B OR (NOT 0x2B)

And so I arrive at an answer to the question in Hamlet’s soliloquy:

0x2B OR (NOT 0x2B) = 0xFF

The answer turns out to be pretty simple and not so philosophical and deep! 😀 If you don’t know why my answer to the famous question is 0xFF, keep on guessing! 😀

(I’m feeling geekier than usual tonight, so there you go)
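For anyone who would rather check than guess, here is a tiny shell sketch of the arithmetic. It assumes 0x2B is treated as a single 8-bit byte (my assumption; the width isn’t stated above), so the NOT is masked back down to 8 bits.

```shell
#!/bin/sh
# Treat 0x2B as an 8-bit value (an assumption; no width is stated in the post).
value=$((0x2B))
# Bitwise NOT, masked back down to 8 bits.
not_value=$(( ~value & 0xFF ))
# OR-ing a byte with its own complement sets every bit.
result=$(printf '0x%02X' "$(( value | not_value ))")
echo "0x2B OR (NOT 0x2B) = $result"
```

Every bit that is 0 in 0x2B is 1 in its complement and vice versa, so the OR of the two has all eight bits set.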

svnserve the people

June 6, 2008

If you consider yourself a programmer, especially one who works on a number of source files for a single project or on multiple projects which you update from time to time, then you should definitely know and use Subversion (svn) or something like it. Subversion is well-known in the open source community and is used by many open source projects such as: Apache Software Foundation, KDE, GNOME, Free Pascal, GCC, Python, Ruby, and Samba to name a few. That being said, what in the world are you waiting for? Go on, start using svn and get going/coding.

This post is about how to quickly use svn to get copies of codes or files from any svn repository on the web or even on your own LAN. The post also tackles how to quickly set up your own svn server or repository so you can start accessing your codes in an organized manner anywhere, as long as you have network connection. This post is by no means exhaustive and is only meant for quick do-it-yourself svn server/client setup. Resources at the end of this post are highly recommended for further or in-depth reading.

As a client

Client for svn means you are just getting files or uploading them from an already established svn repository/server. First, the most helpful svn command that one needs to learn is this line:

$svn help

which should give you this output or something very close to it:

usage: svn <subcommand> [options] [args]
Subversion command-line client, version 1.4.6.
Type 'svn help <subcommand>' for help on a specific subcommand.
Type 'svn --version' to see the program version and RA modules
or 'svn --version --quiet' to see just the version number.

Most subcommands take file and/or directory arguments, recursing
on the directories. If no arguments are supplied to such a
command, it recurses on the current directory (inclusive) by default.

Available subcommands:
add
blame (praise, annotate, ann)
cat
checkout (co)
cleanup
commit (ci)
copy (cp)
delete (del, remove, rm)
diff (di)
export
help (?, h)
import
info
list (ls)
lock
log
merge
mkdir
move (mv, rename, ren)
propdel (pdel, pd)
propedit (pedit, pe)
propget (pget, pg)
proplist (plist, pl)
propset (pset, ps)
resolved
revert
status (stat, st)
switch (sw)
unlock
update (up)

Subversion is a tool for version control.
For additional information, see http://subversion.tigris.org/

Now, in order to get a copy of the code (or part of the code if you wish) in an existing svn repository (e.g. the svn for GCC) you simply enter the following (assuming of course you have svn installed, which shouldn’t be a problem in most operating systems, especially Linux distributions, nowadays):

$svn checkout svn://gcc.gnu.org/svn/gcc/trunk gcc

The previous command gets the code from the GCC trunk (the main branch/working version of a program/project) and copies it to your local storage space (hard disk etc.) under the directory gcc. As the output of svn help above shows, the subcommand checkout and its alias co can be used interchangeably, and the same goes for the other aliases listed in parentheses. Now if you have a working copy from an svn repository and the repository has been updated, all you have to do to update your local copy is go into the directory where you saved your working copy (in this case, gcc) and enter:

$svn update

And for each updated item/code from the repository (repo) a line will start with a character reporting the action taken. These characters have the following meaning:

A – Added
D – Deleted
U – Updated
C – Conflict
G – Merged
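
If an update touches a lot of files, a quick tally of those letters can be handy. Here’s a minimal sketch, assuming the plain letter-then-path lines described above; summarize_update is a name I made up, and lines that carry a second status column (property changes) are simply ignored.

```shell
# Hypothetical helper: tally the action letters printed by `svn update`.
# Reads the update output on stdin and prints "<letter> <count>" per action.
summarize_update() {
  awk '/^[ADUCG] / { count[substr($0, 1, 1)]++ }
       END { for (k in count) print k, count[k] }' | sort
}

# Sample input standing in for real `svn update` output.
printf 'A    docs/new.txt\nU    main.c\nU    lib.h\nUpdated to revision 7.\n' |
  summarize_update
```

Inside a working copy you would pipe the real thing instead: svn update | summarize_update.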

Also, to get more information on a specific svn option such as update or checkout, you can use

$svn help command

where you substitute command with the subcommand you want more info on. If you have your own repo, or if you’re, say, a programmer with write access to an svn repo and you want to upload your part of the code, you can simply copy your files into the directory of your svn copy:

$cp <code/s to copy> <directory where my svn copy is>

then to add the code/s to the version control management of svn:

$svn add <code/s copied>

and finally,

$svn commit -m "your message here for this specific commit"

This commits or “uploads” your code to the svn repo, so that your changes are reflected there and others get them the next time they update. A commit message is required and goes between the quotation marks. You can also omit the -m and the message, in which case svn opens your default text editor so you can enter your commit message there. I personally prefer the second way since I can view the files I edited before committing them. Other useful svn subcommands worth looking into include status, info, diff, and list, which can also be used if you are handling the svn server.

To view the svn logs you or others committed, just enter the command

$svn log

In case you make a mistake after committing your log, and you want to edit that log, the command

$svn propedit svn:log --revprop -rX

will do it for you. X in this case is the revision/log number that you want to edit. However, before a client can go on changing/editing logs, the svn server/administrator should allow the revision change hook (specifically the pre-revprop-change). This will be discussed in the server section below.

Lastly, to change your system’s default svn text editor (which is usually vi or emacs), enter the following:

export SVN_EDITOR=/path/to/preferred/editor

and then change the path accordingly.

As a server

To quickly set up an svn repository that you can later serve over a LAN or the Internet, enter the following:

$svnadmin create myrepos

And change myrepos to your desired project directory name. Next, we create the typical svn directory layout (typical since no rule stops you from using a different layout; it’s just that the layout I’ll be mentioning is the convention widely used by developers):

$svn mkdir file:///home/f/temp/myrepos/trunk \
file:///home/f/temp/myrepos/branch \
file:///home/f/temp/myrepos/tags \
-m "initial layout of project codes"

Here /home/f/temp/myrepos is the absolute path of my svn repo. The trunk directory holds the main branch/version of your code/project. The branch directory holds versions of your project in which you are altering a large part of it, or perhaps an extra/spin-off project from your main project/trunk. The tags directory usually holds the “snapshots” or milestones of your code/project at certain points in its development history, such as stable releases/versions.

To check what you just did or to check any part of your svn repo, use:

$svn list file://URL

where you change URL to an absolute path on your system, such as /home/f/temp/myrepos/. The previous command gives the following output:

branch/
tags/
trunk/

If I have a directory named project_codes, with files

$ ls
index.php lib.h main.c

To add these codes to my svn repo under the trunk directory, and thus under svn’s version control management, I can use:

$ svn import ../project_codes/ file:///home/f/temp/myrepos/trunk \
-m "importing my project codes to svn trunk repo"

which outputs:

Adding ../project_codes/main.c
Adding ../project_codes/lib.h
Adding ../project_codes/index.php

Committed revision 2.

And to check, use list again:

$ svn list file:///home/f/temp/myrepos/trunk
index.php
lib.h
main.c

As usual, you can use the commands shown in svn help to administer your svn repo, never forgetting the syntax for the svn repo directory (svn command file://URL, where URL is an absolute path).
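
Since the absolute path itself starts with a slash, a file:// URL always ends up with three slashes in a row, which is easy to mistype. A small sketch of a helper that builds the URL for you; repo_url is a hypothetical name of mine, not an svn command.

```shell
# Hypothetical helper: build a file:// URL from an absolute repository path
# and an optional path inside the repository.
repo_url() {
  case "$1" in
    /*) ;;  # absolute path, fine
    *)  echo "repo_url: need an absolute path" >&2; return 1 ;;
  esac
  if [ -n "$2" ]; then
    echo "file://$1/$2"
  else
    echo "file://$1"
  fi
}

repo_url /home/f/temp/myrepos trunk
```

You could then write something like svn list "$(repo_url /home/f/temp/myrepos trunk)" instead of typing the URL by hand.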

Lastly, there are 3 ways to serve your svn repo over a network or the Internet, but the ones I’ll be mentioning here are the easiest to set up, probably the most widely used, and reasonably safe to run. But first, go into your svn repo directory, then into its conf directory, and edit the file svnserve.conf. Make sure that the following lines are uncommented (without a ‘#’ at the beginning of the line):

anon-access = read
password-db = passwd

Here anon-access tells svn to let anonymous clients only read the files in your repo, not write over them (the other valid values are write and none). When you want your users to enter passwords before gaining access to your repo, password-db names the file that holds the users and their passwords, in this case the file passwd under the conf directory. The passwd file is worth a look and has a very easy user management syntax. Under the [users] heading in the file passwd, it’s simply:

username = password
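
To see both files side by side, here’s a rough sketch that writes a minimal pair into a scratch directory. The user alice and her password are placeholders, write_svn_conf is a made-up name, and the auth-access = write line is my own addition (a standard svnserve.conf option) so that authenticated users can commit.

```shell
# Sketch: write a minimal svnserve.conf and passwd into a conf directory.
# The user "alice" and her password are placeholders.
write_svn_conf() {
  conf_dir=$1
  mkdir -p "$conf_dir"
  cat > "$conf_dir/svnserve.conf" <<'EOF'
[general]
anon-access = read
auth-access = write
password-db = passwd
EOF
  cat > "$conf_dir/passwd" <<'EOF'
[users]
alice = secret
EOF
}

# Try it on a scratch directory first, not on a live repository.
conf_scratch=$(mktemp -d)
write_svn_conf "$conf_scratch/conf"
cat "$conf_scratch/conf/svnserve.conf"
```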

svnserve

svnserve (svnserve --help for some help or man svnserve for more info) is a lightweight server that allows svn clients to connect to your repo. You can use svnserve as a stand-alone background server or daemon process using:

$svnserve -d -r path

wherein path is a path to your root svn repo, in my case it’s /home/f/temp/myrepos.

svn over ssh

To add encryption (and security) to your svn repo transfers, you can tunnel svn access over ssh. This technique doesn’t actually need an svnserve daemon running on the machine that houses the svn repo you want to access. The reason is that ssh first logs you in to the svn repo machine and then starts svnserve there for the duration of your command (which of course means the account can do a lot more than just connect to the svn repo). To tunnel svn over ssh:

$svn checkout svn+ssh://user@host/path

wherein user is your user login on the svn repo machine, host is the hostname or IP address of that machine, and path is the absolute path to your svn repo on it. Once you manage to connect, you will be prompted for your password on that host/machine, not the password in the passwd file of the svn repo. Your succeeding transactions will ask for your password again, with the benefit that your transfers to and from the svn repo machine are now encrypted.

Lastly, to allow editing of logs, go to the hooks directory of your repo (e.g. /myrepository/hooks/), remove the .tmpl extension from the hook file you want to enable (here pre-revprop-change.tmpl), then make that file executable. Once the pre-revprop-change hook has been enabled, svn users can edit their logs as described in the client section above, for example when they made a mistake in a commit message.
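
Enabling a hook can be wrapped in a tiny function. This is only a sketch: enable_hook is a name I made up, it copies the template instead of renaming it so the original stays around, and the demonstration runs on a stand-in directory rather than a real repo.

```shell
# Hypothetical helper: enable a hook by copying its template and making
# the copy executable. Copying keeps the original .tmpl for reference.
enable_hook() {
  repo=$1
  hook=$2
  cp "$repo/hooks/$hook.tmpl" "$repo/hooks/$hook" &&
    chmod +x "$repo/hooks/$hook"
}

# Demonstration on a stand-in directory; a real repo created by
# `svnadmin create` already ships hooks/pre-revprop-change.tmpl.
# The two-line template written here is only a placeholder.
fake_repo=$(mktemp -d)
mkdir -p "$fake_repo/hooks"
printf '#!/bin/sh\nexit 0\n' > "$fake_repo/hooks/pre-revprop-change.tmpl"
enable_hook "$fake_repo" pre-revprop-change
```

On a real repo the call would look like enable_hook /home/f/temp/myrepos pre-revprop-change.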

Comments and suggestions are highly welcome, as long as they’re in a calm and ruly way (^)__(^).

References and further reading:

Doing cool things in Bash and in Linux

February 29, 2008

Finally, I’m able to cut-in one last post for this month. I’ve some little free time and the things I’ve read and learned again inspired me to write another post.

This post is about the cool things you can do with the Bourne Again Shell, or bash for short. If you want more info afterwards on other cool Linux and bash commands, you can consult my earlier post here. You could consider this post an extension of that one. The difference is that bash can be installed not only on Linux but also on other Unix or *nix related operating systems such as Mac OS or Solaris, and of course, Unix itself.

Anyhow, to start off, I recently re-learned the wonderful use of brace expansion. Brace expansion happens when a command’s argument contains strings (alphanumeric for example) enclosed in curly braces and separated only by commas, such as:

$ echo {s,sd,sdf}

Which outputs the following

s sd sdf

Note that there mustn’t be any unquoted spaces inside the braces. But then that alone doesn’t seem to be anything marvelous, right? So we extend that further to make it a bit more interesting, such as these:

$ echo {red,yellow,blue,black,pink}_mask
red_mask yellow_mask blue_mask black_mask pink_mask

$ echo {"red ","yellow ","blue ","black ","pink "}mask
red mask yellow mask blue mask black mask pink mask

$ echo {red,yellow,blue,black,pink}" mask"
red mask yellow mask blue mask black mask pink mask

Or even nest braces like so:

$ echo {{yellow,red}_color,blue,green}
yellow_color red_color blue green

At this point you might be asking yourself the big “So what???” What use would we (Linux/Bash users, administrators, shell coders etc.) have for brace expansion? A lot actually. One example is copying a file under a new name (for backup, for example) without typing the long file name twice, which is a lot of work even with the help of tab completion, especially if you’re working with files from another directory. If, say, I have a configuration file at /home/f/F/RoR/sample_config_file.conf and I want to create a backup copy of the file in the same directory, I just have to do this

$ cp -v /home/f/F/RoR/sample_config_file.conf{,.bak}
`/home/f/F/RoR/sample_config_file.conf' -> `/home/f/F/RoR/sample_config_file.conf.bak'

Which is equivalent to doing this

$ cp -v /home/f/F/RoR/sample_config_file.conf /home/f/F/RoR/sample_config_file.conf.bak
`/home/f/F/RoR/sample_config_file.conf' -> `/home/f/F/RoR/sample_config_file.conf.bak'

So compared to the last command, the one using brace expansion is much shorter (and looks better too).
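
Brace expansion is just as handy for creating several directories in one go. A small sketch, assuming bash (plain sh does not expand braces); the project, src, lib and doc names are made up, and everything happens in a scratch directory.

```shell
# Scratch directory so the demo does not touch real files.
brace_demo=$(mktemp -d)

# One mkdir call creates three sibling directories.
mkdir -p "$brace_demo"/project/{src,lib,doc}

# The {,.orig} trick from above: copy main.c to main.c.orig.
touch "$brace_demo"/project/src/main.c
cp "$brace_demo"/project/src/main.c{,.orig}

ls "$brace_demo"/project "$brace_demo"/project/src
```

The braces have to stay outside the quotes, otherwise bash treats them as literal characters.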

Next is command substitution, which is very helpful for assigning values to variables, especially in shell scripting. Substitution is done by enclosing a command inside a $( ) combination; bash runs the command and replaces the whole $( ) with its output. An example is the following

$ uname -a
Linux foxhound3 2.6.22-14-generic #1 SMP Tue Feb 12 07:42:25 UTC 2008 i686 GNU/Linux

Using substitution I can do it this way

$ info=$(uname -a)
$ echo $info
Linux foxhound3 2.6.22-14-generic #1 SMP Tue Feb 12 07:42:25 UTC 2008 i686 GNU/Linux

This might seem longer, but a lot more can be done with this technique, such as the next one. Here I issue two commands: the output of the first (which is actually two commands piped together) becomes the arguments of the second (outer) command, like so:

$ ls -lh $(find . |grep txt)

The inner commands in the previous command list all files with the string txt in their paths. The outer command ls -lh shows the time stamp (modification date), file owner, file size etc. of those files.
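
One catch with the $( ) version is that file names containing spaces get split into separate words before ls sees them. A sketch of an alternative that lets find hand the names to ls directly; list_txt is a made-up name, and unlike the grep version it only matches regular files whose own name contains txt.

```shell
# List details of every file whose name contains "txt" under a directory.
# -exec ... {} + passes the names to ls as-is, so spaces survive intact.
list_txt() {
  find "${1:-.}" -type f -name '*txt*' -exec ls -lh {} +
}

# Scratch directory with an awkward file name to try it on.
txt_scratch=$(mktemp -d)
touch "$txt_scratch/my notes.txt" "$txt_scratch/plain.txt"
list_txt "$txt_scratch"
```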

Redirecting standard error is next. Some commands are bound to produce a lot of errors, such as using find to search from the topmost / (slash root) directory when you know full well that you don’t have read access to many directories. For example, looking for the file php.ini:

$ find / -name php.ini

find: /etc/cups/ssl: Permission denied
find: /etc/ssl/private: Permission denied
find: /tmp/gconfd-root: Permission denied
find: /tmp/orbit-root: Permission denied
find: /var/cache/system-tools-backends/backup: Permission denied
find: /var/tmp/kdecache-guest: Permission denied
find: /var/tmp/kdecache-ma3x: Permission denied
find: /var/tmp/kdecache-root: Permission denied
find: /var/spool/postfix/flush: Permission denied
find: /var/spool/postfix/deferred: Permission denied
find: /var/spool/postfix/defer: Permission denied
find: /var/spool/postfix/active: Permission denied
find: /var/spool/postfix/trace: Permission denied
find: /var/spool/postfix/hold: Permission denied
find: /var/spool/postfix/private: Permission denied
find: /var/spool/postfix/saved: Permission denied
find: /var/spool/postfix/maildrop: Permission denied
find: /var/spool/postfix/corrupt: Permission denied
find: /var/spool/postfix/bounce: Permission denied

<other output truncated>

You get the idea. What you can do is to send them to /dev/null if you’re not interested in the error messages like so

$ find / -name php.ini 2> /dev/null
/etc/php5/cli/php.ini
/etc/php5/apache2/php.ini
/etc/php5/cgi/php.ini

Which produces a cleaner output. Or if you want to view error messages later you can issue a

$ find / -name php.ini 2> error_messages.txt

And then view the text file later for error messages. If you want to redirect both the error and standard output messages to the same file you can do a

$find / -name php.ini >output.txt 2>&1

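The important thing to note here is the order: 2>&1 must come after >output.txt. Bash processes redirections from left to right, so standard output is pointed at the file first and standard error is then sent to wherever standard output currently goes. Written the other way around (2>&1 >output.txt), the error messages would still land on the terminal.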
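
The effect of the ordering is easy to try out without crawling the whole file system. In this sketch emit is a made-up stand-in for any command that writes to both standard output and standard error.

```shell
# Hypothetical helper that writes one line to stdout and one to stderr.
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

redir_dir=$(mktemp -d)

# stdout is pointed at the file first, then stderr is duplicated onto it:
# both lines end up in both.txt.
emit > "$redir_dir/both.txt" 2>&1

# stderr is duplicated onto the current stdout (the terminal) first;
# only the stdout line reaches stdout_only.txt.
emit 2>&1 > "$redir_dir/stdout_only.txt"

cat "$redir_dir/both.txt"
```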

When it comes to searching through the history of bash commands executed, the usual way would be to scroll up or down using the arrow keys or enter the command history, view the history number of that long line of commands you entered, then do a !<history number of command here> to execute the command again without typing it. For example

$history

<output truncated>

531 cat output.txt
532 find / -name php.ini >output.txt 2>&1
533 history

then to execute command number 531

$!531

which is equivalent to executing

$cat output.txt

again. But what if your history of commands is already in the hundreds or above a thousand? What you can do is press ctrl and r at the same time which gives you this

$ (reverse-i-search)`':

A command history search. You can start typing the first letter of your previous command and then bash searches for the most recent and closest command you entered.

The last trick is looping. A lot of people think that looping is only good for writing programs, but one can actually put it to good use even in system administration, or plain ‘housekeeping’ of your files etc. For example, I have the files a.txt, b.txt, and c.txt in a directory. In order to make backup copies of them instantly, I do a

$ for something in *; do cp -v $something $something.bak; done

Which gives me the output

`a.txt' -> `a.txt.bak'
`b.txt' -> `b.txt.bak'
`c.txt' -> `c.txt.bak'
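
The one-liner trips over file names with spaces and would happily back up the backups on a second run. A slightly more careful sketch, where backup_all is a hypothetical name and the quoting is the part that matters:

```shell
# Back up every regular file in a directory, skipping existing backups.
backup_all() {
  for f in "$1"/*; do
    [ -f "$f" ] || continue           # skip directories and unmatched globs
    case "$f" in *.bak) continue ;; esac
    cp -v "$f" "$f.bak"
  done
}

# Scratch directory with one awkward file name.
bak_dir=$(mktemp -d)
touch "$bak_dir/a.txt" "$bak_dir/b c.txt"
backup_all "$bak_dir"
backup_all "$bak_dir"   # second run copies again but never makes .bak.bak
```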

And voila! Instant backup of my files. One story I’ve read is about system administrators whose machine had run so low on memory that even basic commands like ls failed. But the administrators knew that a certain file was the one wreaking havoc on the machine. So what they apparently did was this

$ for var in *; do echo $var;done

Which actually displays the files in the current directory, and basically solved the problem (replacing ls for the meantime). This is because those commands including echo are part of bash and are already loaded in memory. “Wonderful!” I said to myself (^)__(^)
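
You can check for yourself which commands are built into the shell and which are separate programs. A small sketch using command -v; list_files is a made-up name for the loop from the story.

```shell
# command -v prints just the name for a builtin and a full path for an
# external program, which makes the difference easy to see.
command -v echo
command -v ls

# A builtin-only stand-in for `ls`, as in the story above.
list_files() {
  for entry in *; do
    echo "$entry"
  done
}
```

Since for and echo never start a new program, list_files keeps working even when the machine can’t spawn ls.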

Thanks to Linux Journal’s issue 132 from April 2005 (a back issue of mine which I re-read before writing this post).

Machine protection with iptables firewall

February 9, 2008

This mini tutorial sets up a firewall on a machine that will connect to the Internet and perhaps serve some web pages from its web server. It was inspired by one of the appendices I’m writing for my thesis, particularly the details of how I set up the firewall on the machine I’ll be using. To set up a firewall as a NAT gateway you can consult this site, but this tutorial will still benefit you since almost all the commands here also apply to a NAT firewall setup. For protection from malicious attackers breaking into the system and/or causing havoc, a firewall is definitely a must. To implement the firewall, I used the iptables [1] utility created by Paul Russell, who founded the Netfilter Core Team, which provides an extensive manual and documentation for iptables. The iptables utility is usually the standard built-in firewall utility in Linux distributions with kernel versions 2.4 or higher. The network interface of the machine is named eth0.

In the commands below, a leading ‘#’ means you need to be root or use sudo to enter the command. The exception, noted where it applies, is a configuration file listing, where ‘#’ starts a comment.

A bit of warning though: if you’re setting up the following rules on a remote machine that you can’t physically reach (e.g. to reboot it manually), and this is your first time using iptables, it’s best to add this line to root’s crontab:

*/15 * * * * /sbin/iptables -P INPUT ACCEPT; /sbin/iptables -F

Every 15 minutes this resets the INPUT policy to ACCEPT and then ‘flushes’ (removes) all the iptables rules you’ve made, in case you lock yourself out of the remote machine while experimenting. Resetting the policy matters because -F only deletes rules; it leaves a DROP policy, like the one set later in this tutorial, in place. That way, even if a mistaken rule disconnects you, you can log in again after the cron job runs. Of course this won’t be of much use if the machine you’re configuring iptables for is the local machine (or you have physical access to it). Remember to remove the crontab entry once you’re happy with your rules.

command            description
-A chain           Appends one or more rules to the end of the given chain.
-I chain rulenum   Inserts a rule into chain at position rulenum. Useful when one wants a rule to take precedence over rules already in the chain.
-L                 Lists all the rules in the given chain, or in all chains if none is given.
-F                 Flushes (deletes) all the rules in the given chain, or in all chains if none is given, basically deleting the firewall configuration.

Table 1 – basic iptables commands [2]

rule specification      description
-p protocol             Specifies the protocol for the rule to match, e.g. icmp, tcp, udp.
-s [!] address[/mask]   Specifies a source address or network to match; a leading ! inverts the match.
-j target               Tells what to do with the packet if it matches the specification. The targets used here are:
                        DROP – Drop packet/s without any further action.
                        REJECT – Drop packet/s and send an error packet in return.
                        ACCEPT – Let the packet/s through.

Table 2 – iptables basic rules specifications [2]

Initially, iptables rules are empty. Rules are the firewall’s configuration, denying and/or accepting certain packets, for example. Checking the rules before any have been added:

# iptables -nvL
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination

Chain OUTPUT (policy ACCEPT 0K packets, 0 bytes)
pkts bytes target prot opt in out source destination

This basically says no rules have been applied yet. If there were previous rules and one wishes to start with a clean slate, the command

# iptables -F

flushes the rules out. Pay careful attention to uppercase and lowercase in the options, since iptables is case-sensitive.

The following setup is for a single non-gateway, non-NAT (Network Address Translation) machine. All packets received by the network interface eth0 whose destination address is the machine’s IP address pass through the INPUT chain. Only wanted packets should be accepted, to guard against attackers, DoS (Denial of Service) attacks, and the like.

First, create custom chains/rules which will become clearer as more chains/rules are given:

# iptables -N open
# iptables -N interfaces

Accept ICMP messages such as pings:

# iptables -A INPUT -p icmp -j ACCEPT

Next is the rule that will make sure no traffic that belongs to already established connections will be dropped. This rule can be done by matching a given state of a connection. A connection can have one of the four states: ESTABLISHED, RELATED, NEW and INVALID. All packets/connections that are in state ESTABLISHED or RELATED should be accepted, turning the firewall into a “stateful firewall”:

# iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT

Not all incoming connections will be denied, so INPUT next jumps to the interfaces and open chains created earlier, which will hold the exceptions:

# iptables -A INPUT -j interfaces
# iptables -A INPUT -j open

After those two jumps, reject all traffic that hasn’t been explicitly accepted by the previous rules. TCP connection attempts are refused with a tcp-reset, and UDP packets are answered with an ICMP port-unreachable message. Replying this way imitates Linux’s default behaviour:

# iptables -A INPUT -p tcp -j REJECT --reject-with tcp-reset
# iptables -A INPUT -p udp -j REJECT --reject-with icmp-port-unreachable

All protocols other than TCP, UDP and ICMP are dropped (unless they manage to match the state match from previous chains). This rule is done by setting the policy for the INPUT chain to DROP

# iptables -P INPUT DROP

Since the machine won’t function as a router/forwarding device, we set the policy of the FORWARD chain to DROP:

# iptables -P FORWARD DROP

There is no need to filter any outgoing traffic. Set the OUTPUT policy to ACCEPT.

# iptables -P OUTPUT ACCEPT

Use the interfaces chain to accept any traffic from trusted interfaces. The following rule is absolutely necessary:

# iptables -A interfaces -i lo -j ACCEPT

The previous rule accepts all traffic from the loopback interface, lo, which is necessary for many applications to work properly. Incoming connections on other interfaces will be denied, unless they hit another exception in the open chain.

The open chain contains rules for accepting incoming connections on specific ports/protocols. To accept ssh (default port is 22) connections on every interface

# iptables -A open -p tcp --dport 22 -j ACCEPT

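Access to port 22 can be limited to specific machines by modifying the /etc/hosts.allow file. The local machine actually uses port 8094 (arbitrarily chosen) for ssh connections instead of the default port 22, so substitute your own port in the previous rule if you have moved sshd as well. Since the machine may also serve web pages, accept HTTP connections (port 80) on eth0: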

# iptables -A open -i eth0 -p tcp --dport 80 -j ACCEPT
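
With this many commands it helps to collect them in one script and look at what it would do before running it for real. The following is only a sketch under my own assumptions: the ipt wrapper, the build_firewall function and the DRY_RUN variable are made-up names, and by default the script just prints the commands from this tutorial, in the same order, instead of executing them.

```shell
#!/bin/sh
# Sketch only: prints the iptables commands unless DRY_RUN=0 is set.
# ipt, build_firewall and DRY_RUN are hypothetical names.
DRY_RUN=${DRY_RUN:-1}

ipt() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "iptables $*"
  else
    iptables "$@"
  fi
}

build_firewall() {
  # Custom chains and the basic stateful skeleton.
  ipt -N open
  ipt -N interfaces
  ipt -A INPUT -p icmp -j ACCEPT
  ipt -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
  ipt -A INPUT -j interfaces
  ipt -A INPUT -j open
  ipt -A INPUT -p tcp -j REJECT --reject-with tcp-reset
  ipt -A INPUT -p udp -j REJECT --reject-with icmp-port-unreachable
  # Default policies.
  ipt -P INPUT DROP
  ipt -P FORWARD DROP
  ipt -P OUTPUT ACCEPT
  # Trusted interface and open ports.
  ipt -A interfaces -i lo -j ACCEPT
  ipt -A open -p tcp --dport 22 -j ACCEPT
  ipt -A open -i eth0 -p tcp --dport 80 -j ACCEPT
}

build_firewall
```

Once the printed list looks right, running the script as root with DRY_RUN=0 applies the rules.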

Next, force SYN packet checking. Make sure NEW incoming tcp connections are SYN packets (synchronization); otherwise drop them. Rules are evaluated in order and the two REJECT rules above already catch all TCP and UDP traffic, so this check and the ones after it are inserted at the top of INPUT with -I rather than appended with -A:

# iptables -I INPUT -p tcp ! --syn -m state --state NEW -j DROP

Now, force fragmented packets to be checked. Incoming fragments are dropped.

# iptables -I INPUT -f -j DROP

Incoming malformed Christmas tree packets [3] are dropped:

# iptables -I INPUT -p tcp --tcp-flags ALL ALL -j DROP

As well as incoming malformed NULL packets:

# iptables -I INPUT -p tcp --tcp-flags ALL NONE -j DROP

To prevent spoofed traffic [4], block reserved private networks coming from the Internet

# iptables -I INPUT -i eth0 -s 10.0.0.0/8 -j DROP
# iptables -I INPUT -i eth0 -s 172.16.0.0/12 -j DROP
# iptables -I INPUT -i eth0 -s 192.168.0.0/16 -j DROP
# iptables -I INPUT -i eth0 -s 127.0.0.0/8 -j DROP

The following line is added to the /etc/sysctl.conf configuration file to enable source address verification which is built into the Linux kernel itself.

  net.ipv4.conf.all.rp_filter = 1

In order to further lessen the network traffic that will be experienced by the machine’s network interface (eth0), specific types of unwanted ICMP packets [5] will be dropped

# iptables -I INPUT -p icmp --icmp-type redirect -j DROP
# iptables -I INPUT -p icmp --icmp-type router-advertisement -j DROP
# iptables -I INPUT -p icmp --icmp-type router-solicitation -j DROP
# iptables -I INPUT -p icmp --icmp-type address-mask-request -j DROP
# iptables -I INPUT -p icmp --icmp-type address-mask-reply -j DROP

Now, the rules should be saved. Different Linux distributions have different ways of saving iptables rules. The following comes from an Arch Linux setup. The configuration file /etc/conf.d/iptables is edited first for further security:

# Configuration for iptables rules
IPTABLES=/usr/sbin/iptables
IPTABLES_CONF=/etc/iptables/iptables.rules
IPTABLES_FORWARD=0 # disable IP forwarding!!!

Here IPTABLES_CONF names the file where the rules will be saved; the filename (iptables.rules in this case) is arbitrary.

Now, save the rules with the command

# /etc/rc.d/iptables save

and to make sure the rules are loaded when the machine is rebooted, edit the /etc/rc.conf file and add iptables to the DAEMONS array, preferably before ‘network’:

 DAEMONS=(... iptables network ...)

For other Linux distros (Debian, Ubuntu, Fedora etc.), issuing the command

# iptables-save > /etc/firewall.conf

saves all your iptables rules to the arbitrarily named file firewall.conf. Then after a reboot, you can do a

# iptables-restore < /etc/firewall.conf

to restore your iptables rules. Or you can just create a simple shell script like so

echo '#!/bin/sh' > /etc/network/if-up.d/iptables
echo "iptables-restore < /etc/firewall.conf" >> /etc/network/if-up.d/iptables
chmod +x /etc/network/if-up.d/iptables

This lets ifup load your rules automatically whenever a network interface comes up, so there is nothing more to run at boot. Enter the three commands above once, as root.
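
A here-document is another way to write the same two-line script, and it avoids any quoting trouble with the #! line. In this sketch write_ifup_script is a made-up name, and the target path is a parameter so you can try it on a scratch file before pointing it at /etc/network/if-up.d/iptables as root.

```shell
# Sketch: write the restore script with a here-document instead of echo.
# The quoted 'EOF' keeps the shell from expanding anything in the body.
write_ifup_script() {
  target=$1
  cat > "$target" <<'EOF'
#!/bin/sh
iptables-restore < /etc/firewall.conf
EOF
  chmod +x "$target"
}

# Dry run against a scratch file.
ifup_demo=$(mktemp)
write_ifup_script "$ifup_demo"
cat "$ifup_demo"
```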

Now you might want to log the packets that are dropped. A quick and simple way to log those packets on the file /var/log/syslog is this:

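# iptables -I INPUT 5 -m limit --limit 5/min -j LOG --log-prefix "iptables denied: " --log-level 7

Here -I INPUT 5 inserts the rule at position 5 of the INPUT chain, -m limit --limit 5/min caps logging at an average of five entries per minute, --log-prefix tags each entry so it is easy to find, and --log-level 7 is the syslog debug level. LOG does not stop processing, so a logged packet still continues on to the rules that follow.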
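
Once entries start piling up, a per-source tally tells you quickly who is knocking. A rough sketch, assuming the log prefix used above and the SRC= field that the LOG target writes; denied_by_source is a made-up name and the sample lines are invented for the demonstration.

```shell
# Hypothetical helper: count logged packets per source address.
# Reads syslog lines on stdin and prints "<count> <address>", busiest first.
denied_by_source() {
  grep 'iptables denied: ' |
    sed -n 's/.*SRC=\([^ ]*\).*/\1/p' |
    sort | uniq -c | sort -rn |
    awk '{ print $1, $2 }'
}

# Made-up sample lines in the shape the LOG target writes.
printf '%s\n' \
  'Feb  9 10:00:01 host kernel: iptables denied: IN=eth0 OUT= SRC=10.0.0.1 DST=192.0.2.5 PROTO=TCP' \
  'Feb  9 10:00:02 host kernel: iptables denied: IN=eth0 OUT= SRC=10.0.0.2 DST=192.0.2.5 PROTO=UDP' \
  'Feb  9 10:00:03 host kernel: iptables denied: IN=eth0 OUT= SRC=10.0.0.1 DST=192.0.2.5 PROTO=TCP' |
  denied_by_source
```

On the real machine you would feed it the log instead: denied_by_source < /var/log/syslog.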

Comments/suggestions/questions/reactions are welcome as long as they come in a calm and ruly way (^)__(^)

References:

[1] T. Howlett. Open Source Security Tools: Practical Applications for Security. Prentice Hall Professional Technical Reference, 2005.

[2] P. Russell et al. Iptables manuals and documentation. http://www.netfilter.org/documentation/index.html (January 2008).

[3] H. Bidgoli. The Internet Encyclopedia. John Wiley and Sons, 2004.

[4] M. Freire and M. Pereira. Encyclopedia of Internet Technologies and Applications. Information Science Reference, 2008.

[5] Internet Assigned Numbers Authority. ICMP type numbers RFC list. http://www.iana.org/assignments/icmp-parameters (January 2008).

youtube download script update

February 8, 2008

Actually, there’s no real update at my older post regarding the python script I created for automatically downloading multiple youtube videos. And there’s also nothing wrong with it. Problem is, I tried downloading videos last week and this is the output of my python script:

Retrieving video webpage... done.
Extracting URL "t" parameter... failed.
Error: unable to extract URL "t" parameter.

Try again several times. It may be a temporary problem.

Other typical problems:

* Video no longer exists.

* Video requires age confirmation but you did not provide an account.

* You provided the account data, but it is not valid.

* The connection was cut suddenly for some reason.

* YouTube changed their system, and the program no longer works.

Try to confirm you are able to view the video using a web browser.

Use the same video URL and account information, if needed, with this program.

When using a proxy, make sure http_proxy has http://host:port format.

Try again several times and contact me if the problem persists.

This usually shows up when youtube-dl needs updating, most probably because youtube made changes to how they serve their videos. Download/save the latest youtube-dl here and copy it again to a directory in your $PATH (in Linux), or you can again refer to my previous post about youtube-dl. Remember, no changes have been made to my python script; you only have to update the youtube-dl code.

After updating my copy of youtube-dl on my Linux box, I’m off to downloading more videos from youtube (^)__(^)