env. preparation

Uncategorized March 17th, 2008

Keeping many tools at hand is essential:

  • Windows under Ubuntu via VirtualBox
  • StarUML on Windows
  • Wine for lightweight Windows executables (e.g., StarUML)
  • Java & Web IDE and frameworks
  • db server, web server & app server
  • Project Planning Utilities: openproj & planner
  • Project configuration tools
  • Auto-build and release env.

Still needed:

  • free style drawing tool

During the preparation, some tools impressed or even astonished me.

VirtualBox is simple and small (only 20+ MB) yet powerful, and in my experience it performs better than VMware Workstation. Most importantly, it’s open source and therefore totally free.

VMware Server 2.0 beta is unbelievable and ambitious. Its web-based access astonished me, not only for managing VMs (via Ajax) but also for opening a VM console directly in the web browser. The browser-based VM console uses no Java/Flash/Silverlight/AIR; instead it relies on a platform-dependent browser plugin. Remote access to all of this conveys the ambition, even the revolutionary nature, of this product. Enterprise-level virtualization (not only for the giants, but also for small and medium businesses) is just around the corner. It also gives us another picture of thin clients and SaaS. Where does grid computing fit? I guess it may take the position of the layer underlying virtualization, acting more as a way of aggregating organization-wide computing power than as a way of providing a paid public service. Non-profit academic research, though, has a different picture of grid computing. We are really entering an era of the web and distributed systems.

“Virtualization” is really a beautiful word! But in essence it is just abstraction for coping with complexity.

JMX Study Notes

Java, notes March 10th, 2008

1. Overview

Java Management Extensions (JMX) is used to manage resources (applications, services, devices, etc.) in a JVM, either locally (in-box) or remotely (out-of-box).

It is developed through JSR 3 (the JMX spec) and JSR 160 (the extension for remote management).

2. Architecture

[MBean Server] Agent <--> connector (various protocols) <--> management application
              |
             \|/
[resources] MBean (instrumentation)
              |
             \|/
JVM (local or remote)

An MBean (a managed Java object, similar to a JavaBean) is registered with the agent, and resources are wrapped in MBeans, which the JMX spec calls instrumentation.

Interfaces:

The management interface of an MBean consists of:

  • Named and typed attributes that can be read and/or written
  • Named and typed operations that can be invoked
  • Typed notifications that can be emitted by the MBean

MBeans can be instantiated and registered by:

  • Another MBean
  • The agent itself
  • A remote management application

 The operations available on MBeans include:

  • Discovering the management interface of MBeans
  • Reading and writing their attribute values
  • Performing operations defined by the MBeans
  • Getting notifications emitted by MBeans
  • Querying MBeans based on their object name or their attribute values

Management applications do their management through agent services that operate on MBeans.
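
To make the agent/MBean relationship concrete, here is a minimal sketch that registers a standard MBean with the platform MBean server and then reads an attribute and invokes an operation purely by name, the way a management application would. The Counter resource and the object name name.kenyth.sketch:type=Counter are made up for illustration; only the javax.management calls are real API.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxSketch {
    // Management interface. For a standard MBean its name must be the
    // implementing class name plus the suffix "MBean".
    public interface CounterMBean {
        int getCount();     // exposed as the readable attribute "Count"
        void increment();   // exposed as the operation "increment"
    }

    // The "resource" being instrumented (hypothetical, made up for this sketch).
    public static class Counter implements CounterMBean {
        private int count;
        public synchronized int getCount() { return count; }
        public synchronized void increment() { count++; }
    }

    // Registers the MBean with the agent, then manages it by name only.
    public static int registerAndUse() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("name.kenyth.sketch:type=Counter");
            server.registerMBean(new Counter(), name);
            try {
                server.invoke(name, "increment", new Object[0], new String[0]);
                server.invoke(name, "increment", new Object[0], new String[0]);
                return (Integer) server.getAttribute(name, "Count");
            } finally {
                server.unregisterMBean(name);
            }
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Count = " + registerAndUse());
    }
}
```

While such a program is running, a tool like JConsole should show the same attribute and operation under the MBeans tab.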

3. Resources

  1. Java Management Extensions (JMX)

Java Troubleshooting Utilities

Java, notes March 9th, 2008

jmap: prints memory statistics for a running jvm or core file

Tip: use jps to list the Java processes currently running on the host machine.

jmap -heap <pid>

heap memory statistics

jmap -permstat <pid>

permanent generation memory statistics

jmap -histo <pid>

This is super useful. It displays the number of instances of each class and the memory they occupy.

jmap -dump:format=b,file=ConsoleInput.hprof <pid>

This is a must-use. It dumps the running JVM’s heap into a binary file in a format called hprof, which another utility called jhat can analyze.

jhat: Java Heap Analysis Tool

jhat takes the heap dump file produced by jmap as input and starts a lightweight local web server so that you can browse and query the analysis results. I have to say the analysis results are very detailed.

Just fire:

jhat <dump_file_name>

Reading from ConsoleInput.bin…
Dump file created Sun Mar 09 00:59:15 CST 2008
Snapshot read, resolving…
Resolving 5032 objects…
Chasing references, expect 1 dots.
Eliminating duplicate references.
Snapshot resolved.
Started HTTP server on port 7000
Server is ready.

Then I open http://localhost:7000 in Firefox, which shows:

Package name.kenyth.hello

class name.kenyth.hello.ConsoleInput [0x91149f98]

Other Queries

As you can see, you can also query the results by writing your own OQL, which is SQL-style and therefore quite handy.

Others

Follow the link listed below to find even more.

 Resources

  1. Troubleshooting and Diagnostic Guide. A good place to start investigating and experimenting.

JVM Structure Study Notes

Java, notes March 9th, 2008

1. Overview

Setting specific implementations aside, let’s first look at what the JVM spec describes. Most of this post is about the server VM, but where applicable it also holds for the client VM.

If you already know conventional compiled languages (Java itself may count as conventional from some points of view, so here the term means compiled languages invented before Java) and how compilers, linkers and loaders work, the internal structure of the JVM (more accurately, its structure at runtime) is easy to understand, and studying it may in turn help you understand conventional compiled languages better.

2. Abstract Structural Element

This section introduces the elements that make up the runtime JVM structure. They are meant to be independent of any specific data structure implementation. Each element’s subsection starts with a list of its important properties.

pc

  • private to each thread
  • execution (which means it supports execution)
  • created when each thread is created

The program counter indicates the JVM instruction currently being executed. The JVM can support many threads of execution at once, and each JVM thread has its own pc. Unlike many conventional languages, the Java programming language is purely OO, which means no real global “things” are allowed. Think about the execution of a Java program: the main method is the entry point, from which control may flow to the constructor of some class, a static method of some class, a method of some class instance, and so on. You will quickly come to the conclusion, just as the JVM spec states, that at any point in time each JVM thread is executing the code of a single method. For a normal (non-native) method, pc holds the address of the JVM instruction currently being executed, while for a native method the value of pc is undefined, which means it is entirely up to the specific JVM implementor.

Method Area

  • shared by threads
  • program (which means it is the “mapping” of your program into jvm)
  • created at jvm start-up

Analogous to the runtime structure of a program written in a conventional language, a code area exists in the JVM structure. In the JVM it is called the method area, though it holds more than method code. It is shared by all threads of execution in the JVM. It contains per-class structures such as the runtime constant pool, field data and method data, and the code of methods (including constructors). In Java terms, it contains the reflective data about the classes and methods of a Java program. This means that whenever new classes are loaded (or, as a special case, strings are interned), this area grows.

Runtime Constant Pool

  • private to each class
  • program
  • created at compile time and “mapped” when each class is loaded
  • part of method area

Refer to chapter 3 and chapter 4 of the JVM spec for more information.

Frames

  • private to each thread (more specifically, to each method invocation)
  • execution
  • created at method invocation
  • part of collection of frames

With the method area alone (the code the JVM understands), the program is not alive, since the method area is static in the sense that its contents are fixed at compile time or load time. So, just like activation records in a conventional program, frames represent the actual execution of a program. Each thread of execution manages its own collection of frames, which is not shared with other threads. This collection may itself be considered an abstract structural element, which we will talk about later. Whenever a method is invoked, a new frame is dynamically created, and it is discarded when the invocation completes, whether normally or abruptly. The method may be non-native or native.

Each frame consists of the local variables of the method and the state of the method’s execution, based on which the JVM can move the execution forward, whether by returning control to the invoker or by continuing within the method. The creation of frames and the advancing of their state, which together make up the execution, are entirely determined by the information about the corresponding method stored in the method area, that is, by the program.

Collection of Frames

  • private to each thread
  • execution
  • created when each thread is created

As mentioned above, it holds the frames of a thread during execution: frames are added as methods are invoked and discarded as the invocations complete.
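
One way to observe that each invocation adds exactly one frame to the thread’s collection is to count stack trace elements. The sketch below is only an illustration (the class and method names are made up), and it assumes a VM such as HotSpot that reports every frame in a stack trace.

```java
public class FrameCountSketch {
    // Number of frames currently on the calling thread's stack,
    // including the frame of this method itself.
    public static int currentDepth() {
        return new Throwable().getStackTrace().length;
    }

    // Called directly: caller -> direct -> currentDepth
    public static int direct() {
        return currentDepth();
    }

    // Called through one extra method: caller -> nested -> inner -> currentDepth
    public static int inner() {
        return currentDepth();
    }

    public static int nested() {
        return inner();
    }

    // Difference in depth between the two call paths from the same caller.
    public static int extraFrames() {
        return nested() - direct();
    }

    public static void main(String[] args) {
        System.out.println("extra frames for one extra invocation: " + extraFrames());
    }
}
```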

Shared Data Area

  • shared by all threads
  • execution
  • created at jvm start-up

All data created dynamically at runtime is stored in this area, including all class instances and arrays. It roughly corresponds to the area in a conventional program where global variables and dynamically allocated data are stored.

Representation of Objects

  • shared by all threads and accessed according to visibility of the type of each object
  • execution
  • created
  • part of method area or shared data area

How is a real object represented? Many abstract structural elements may, in one way or another, have access to this representation. Since each class also has its own representation in the JVM, which is itself an object, this kind of element exists in the method area as well. (correct me)

3. Data Structure Under the Hood

This section contains information that very likely is not very accurate and that is only based on my own knowledge. Please reference at your own risk.

pc

pc is a pointer stored in a register (a logical register of the JVM, or a register of the physical host machine). It may contain very little information, but what matters most is that it must be very fast to access.

Method Area

Arrays, c-style structs, lists, tables (most likely an array of c-style structs), or cascaded tables.

Frames

Local variables are stored in a table, and the state of the method execution is kept on a stack called the operand stack. Executing the instructions stored in the method area pops operands from this stack and/or pushes the results of operations onto it.
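
The interplay of the local variable table and the operand stack can be sketched with a toy interpreter. This is only a model for illustration, not how any real JVM is written: the Op enum is a made-up subset loosely named after the real iload/iadd/imul/ireturn instructions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class FrameSketch {
    // Toy instructions, loosely modelled on the real iload/iadd/imul/ireturn.
    public enum Op { ILOAD_0, ILOAD_1, IADD, IMUL, IRETURN }

    // Executes one method invocation in a fresh "frame":
    // a local variable table plus an operand stack.
    public static int execute(Op[] code, int[] localVariables) {
        Deque<Integer> operandStack = new ArrayDeque<>();
        for (int pc = 0; pc < code.length; pc++) {   // pc: index of the current instruction
            switch (code[pc]) {
                case ILOAD_0: operandStack.push(localVariables[0]); break;
                case ILOAD_1: operandStack.push(localVariables[1]); break;
                case IADD:    operandStack.push(operandStack.pop() + operandStack.pop()); break;
                case IMUL:    operandStack.push(operandStack.pop() * operandStack.pop()); break;
                case IRETURN: return operandStack.pop();
            }
        }
        throw new IllegalStateException("code ended without IRETURN");
    }

    public static void main(String[] args) {
        // Roughly what "return a + b;" looks like for a static method taking two ints.
        Op[] add = { Op.ILOAD_0, Op.ILOAD_1, Op.IADD, Op.IRETURN };
        System.out.println(execute(add, new int[] { 2, 3 }));
    }
}
```

Running javap -c on a real class file shows the actual instruction sequence the compiler emitted, which can be compared against this model.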

Collection of Frames

Each collection is stored in a stack called the JVM stack. An invocation of a method pushes a newly created frame onto the stack, and the completion of an invocation pops the frame off the stack, after which control returns to the invoker’s frame. Each JVM thread has two kinds of stack: a stack for Java code and a stack for native code.

Shared Data Area

The shared data area is structured as a heap. GC spends most of its time here.

Representation of Objects

Objects are stored and organized on the heap. Their internal structure may be a C-style struct record.

The representation is left undefined by the spec, which means it is completely up to the specific JVM implementation. Feel free to skip this subsection.

As I imagine it, one goes first through the representation of the reference, then to the shared data area, and finally to the representation of the object.

As the JVM spec notes for some of Sun’s implementations, a reference is a pointer to a handle that in turn holds the address of the real object, which makes it similar to a pointer to a pointer.

4. More Under the Hood

This section contains information that very likely is not very accurate and that is only based on my own knowledge. Please reference at your own risk.

Heap and stack (and the other data structures mentioned in this post) are used here not as abstract structural elements but as names for the underlying implementation of the data, or for their counterparts in data structure theory. The purpose of introducing the two concepts of heap and stack and making the difference visible is that, although both can be allocated dynamically, the workloads and policies for managing their allocation and deallocation are quite different. A stack is quite manageable (it needs little work and theoretically runs no risk of memory leaks) and the size of each of its frames is easily inferred at compile time, while a heap needs quite complex management (so much work that the runtime performance of the program can be significantly affected) and its size is most of the time impossible to infer at compile time.

So below we divide memory into heap memory and non-heap memory, following Sun’s official documentation on its JVM implementation.

Heap Memory

  • shared data area

Non-heap Memory

  • Java virtual machine stack
  • native method stack (for native method frames, correct me)
  • method area (logically part of the heap, but managed under a quite different policy, so it is usually counted as non-heap memory), corresponding to the permanent generation space of Sun’s implementation
  • memory for internal processing (thread scheduler or dispatcher) or optimization

Garbage Collection

Few parts of the JVM’s memory are of fixed size at runtime; they may be expanded and/or compacted from time to time. This work is done by the memory manager, and the garbage collector is one type of memory manager. It destroys “unreferenced” objects to free memory space.

Sun’s implementation of the JVM uses a strategy called generational GC (garbage collection). It divides memory into several generations: the young generation (eden space and survivor spaces, together sometimes called the new generation), the tenured generation (or old generation), and the permanent generation. A GC performed in the young generation is called a partial GC; one performed in the old generation is called a full GC. A partial GC is much faster than a full GC, because the young generation is comparatively small and most of its objects are already dead by the time it is collected. Each generation is sometimes called a memory pool.

However, heap memory and non-heap memory don’t align cleanly with the generations mentioned above. Roughly, heap memory aligns with the eden space, survivor spaces and tenured space, and non-heap memory aligns with the permanent generation. Tuning the sizes of the different generations and their ratios has a big impact on GC performance when your program runs in a demanding environment, say within a hosted web server. To complicate matters, almost every abstract structural element mentioned in the previous sections has a corresponding option to tune.
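
You can ask a running JVM how its memory pools are classified. The sketch below uses the standard java.lang.management API to group the pools into heap and non-heap; the class name is made up, and the pool names it prints depend on the JVM version and the collector in use, so treat any particular output as an example only.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class MemoryPoolSketch {
    // Groups the names of the JVM's memory pools by type (HEAP or NON_HEAP).
    public static Map<MemoryType, List<String>> poolsByType() {
        Map<MemoryType, List<String>> pools = new EnumMap<>(MemoryType.class);
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            pools.computeIfAbsent(pool.getType(), t -> new ArrayList<>()).add(pool.getName());
        }
        return pools;
    }

    public static void main(String[] args) {
        poolsByType().forEach((type, names) -> System.out.println(type + ": " + names));
    }
}
```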

GC can also be configured to use different algorithms, but this is out of the scope of this post.

Tuning JVM

One of the purposes of writing this post is to help you tune your JVM better, that is, to give you a good idea of which part of the JVM you are tuning when you adjust a given JVM option. The Troubleshooting and Diagnostic Guide is a good starting point for tuning your JVM. Only Sun’s implementation is discussed below.

Tuning Heap

Note that Sun’s official documents are inconsistent: the permanent generation space is sometimes counted as heap memory and sometimes as non-heap memory. This is why we mentioned above that the method area logically belongs to heap memory although we put it in the non-heap category.

  • -Xms[value] // initial size of the total heap, e.g., -Xms64m. A heap of this size is reserved when the JVM starts up. If this memory request cannot be met, the start-up fails.
  • -Xmx[value] // maximum size of the total heap, e.g., -Xmx1024m. When more memory than the maximum heap size is requested, usually while creating new class instances or arrays, an OutOfMemoryError is thrown.
  • -XX:MinHeapFreeRatio=minimum // the minimum percentage of the heap that should be free after a GC; if free space falls below it, the heap is expanded. This is the lower limit of the range each GC tries to keep the free proportion within.
  • -XX:MaxHeapFreeRatio=maximum // the maximum percentage of the heap that should be free after a GC; if free space rises above it, the heap is shrunk. This is the upper limit of the range each GC tries to keep the free proportion within.
  • -XX:NewRatio=ratio // the ratio of the old generation to the new generation (e.g., -XX:NewRatio=3 makes the old generation three times the size of the new one); when the heap grows or shrinks, the JVM must recalculate the sizes of the new and old generations
  • -XX:NewSize=size // the minimum size of the new generation
  • -XX:MaxNewSize=size // the maximum size of the new generation
  • -XX:OldSize=size // the size of the old generation
  • -XX:SurvivorRatio=ratio // the ratio of the eden space to one survivor space (e.g., -XX:SurvivorRatio=6 makes each survivor space one sixth the size of eden)
  • -XX:PermSize=size // minimum size of the permanent generation
  • -XX:MaxPermSize=size // maximum size of the permanent generation

Sun’s JVM has two equally sized survivor spaces; the survivor-related tuning above applies to each of them, not to the two in total.

To make matters clear, the definitions of new generation and old generation are restated here:

  • new generation = eden space + survivor space
  • old generation = tenured space
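
To check what the heap options actually did, you can read the heap figures back from inside the program. This is a minimal sketch using the standard MemoryMXBean; the class name is made up, and the mapping of "init" to -Xms and "max" to -Xmx is approximate, since the JVM may round or adjust the values.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapSizeSketch {
    // Snapshot of the current heap figures as the JVM reports them.
    public static MemoryUsage heapUsage() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    public static void main(String[] args) {
        MemoryUsage heap = heapUsage();
        // getInit() and getMax() return -1 when the value is undefined.
        System.out.println("init (roughly -Xms), bytes: " + heap.getInit());
        System.out.println("used, bytes:                " + heap.getUsed());
        System.out.println("committed, bytes:           " + heap.getCommitted());
        System.out.println("max (roughly -Xmx), bytes:  " + heap.getMax());
    }
}
```

Try running it with something like java -Xms64m -Xmx256m HeapSizeSketch and compare the numbers with the options you passed.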

Tuning Others

  • -XX:ThreadStackSize=512 // thread stack size, in KB
  • -Xss[value] // thread stack size; on HotSpot the Java and native frames of a thread share this one stack, and a StackOverflowError is thrown if the limit is exceeded
  • -Xoss[value] // Java stack size in the older classic VM; HotSpot accepts the option but ignores it
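
The effect of the stack size is easy to see with runaway recursion. The sketch below (class and method names are made up) counts how many frames fit before a StackOverflowError, first on the current thread and then on a thread created with a larger requested stack size. The numbers vary from run to run and between JVMs, and the JVM is free to treat the requested size as a hint only.

```java
public class StackDepthSketch {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse();   // every call pushes one more frame onto the thread's stack
    }

    // Recurses until the thread's stack is exhausted and reports how deep it got.
    public static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // the stack limit was hit; the frames are popped as the error propagates
        }
        return depth;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("current thread: " + measureDepth() + " frames");
        // The last constructor argument is the requested stack size in bytes.
        Thread big = new Thread(null,
                () -> System.out.println("16 MB request:  " + measureDepth() + " frames"),
                "big-stack", 16L * 1024 * 1024);
        big.start();
        big.join();
    }
}
```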

5. Resources

  1. JVM SPEC 2nd edition, a must read.
  2. Using JConsole. Aside from how to use it, it also contains much information about the structure of Sun’s implementation of the JVM.
  3. Troubleshooting and Diagnostic Guide. A good place to start investigating and experimenting.
  4. OpenJDK. The source code is available for further investigating and experimenting.
  5. Questions about JVM internals, a discussion on a BBS by non-experts, a good starting point.
  6. My other related blog posts (in Chinese)
  7. Tuning the Java Runtime System
  8. Tuning Java Virtual Machine, IBM’s guide in WebSphere manual.
  9. Categories of Java HotSpot VM Options
  10. JDK Tools and Utilities

Sharing one dsl internet connection in a HUB-based LAN

Uncategorized January 29th, 2008

Since I’ve got a new laptop, I’d like to share one internet connection between it and my desktop pc, both of which dual-boot Linux (Ubuntu) and Windows (Vista/XP).

Until now I’ve only tested the sharing with both machines under Ubuntu; later I’ll test it with one on Ubuntu (serving as the router) and the other on Windows. If in your case the Windows machine is the one connected to the Internet, you may use Windows’ ICS (Internet Connection Sharing), which is very simple. Here, though, I’m writing about the case where the Ubuntu machine is connected to the Internet.

Configuration:
laptop (ubuntu 7.10)
eth0 (192.168.0.1) <--> hub
ppp0 <--> Internet

desktop pc (ubuntu 7.10)
eth0 (192.168.0.8) <--> hub

The following steps will do this job:
on laptop:
1. enable NAT (network address translation, you may encounter this if you’re playing with vmware)

root:~> iptables -t nat -A POSTROUTING -o ppp0 -j MASQUERADE

Doing only this, NAT may not be enabled after you reboot your system. So add the following to /etc/network/interfaces (for example at the end of the auto eth0 section):

pre-up iptables-restore < /etc/iptables.rules
post-down iptables-save > /etc/iptables.rules

Optionally, if things still don’t work after all the other steps (including those below), run this before the others:

root:~> iptables -P FORWARD ACCEPT

2. enable IP forwarding. Add this line to /etc/sysctl.conf (for example at the end of the file):
net.ipv4.ip_forward = 1

On desktop pc:
1. set the laptop as its default gateway

root:~> route add default gw 192.168.0.1

All done! Now reboot the laptop and bring up the DSL connection. To check that everything is configured properly, run:

root:~> cat /proc/sys/net/ipv4/ip_forward
1
root:~> iptables -t nat -L POSTROUTING
target prot opt source destination
MASQUERADE 0 -- anywhere anywhere

If you get the same output, everything went all right. Ping a website from the desktop PC to make sure the sharing works.

I did all the above following these:
http://forums.afterdawn.com/thread_view.cfm/616232
http://lindesk.com/2007/04/internet-connection-sharing-using-iptables/

The key to understanding what these steps mean and do to your system is to understand the iptables command and the MASQUERADE target (“target” is an iptables term). I read the iptables manual carefully, and as I understand it, enabling MASQUERADE (a form of NAT) replaces the source address of packets that the desktop PC sends out to the Internet with ppp0’s address, and replaces the destination address of packets sent back from the Internet to the desktop PC with the desktop PC’s LAN address. In addition, “-o ppp0” means that all packets going out through ppp0 will be masqueraded.

But the trick is how the laptop knows which packets are meant for itself and which for the desktop PC. If you have ever investigated network connections in a LAN, you may have found that the port number is the secret.
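
The port trick can be modelled with a small translation table. The sketch below is a deliberately simplified toy, not how the kernel does it: real connection tracking keys on the full protocol/address/port tuple and usually tries to keep the original source port. The class name, the starting port 40000 and the public address 203.0.113.5 are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class NatSketch {
    private final String publicAddress;
    private int nextPort = 40000;   // arbitrary first public port for this sketch
    private final Map<String, Integer> outbound = new HashMap<>(); // "lanIp:port" -> public port
    private final Map<Integer, String> inbound = new HashMap<>();  // public port -> "lanIp:port"

    public NatSketch(String publicAddress) {
        this.publicAddress = publicAddress;
    }

    // Outgoing packet: the source becomes publicAddress:port, and the mapping is remembered.
    public String translateOutgoing(String lanIp, int lanPort) {
        String key = lanIp + ":" + lanPort;
        Integer port = outbound.get(key);
        if (port == null) {
            port = nextPort++;
            outbound.put(key, port);
            inbound.put(port, key);
        }
        return publicAddress + ":" + port;
    }

    // Reply arriving on a public port: returns the LAN endpoint it belongs to,
    // or null when no mapping exists (the packet is for the router itself, or unknown).
    public String translateIncoming(int publicPort) {
        return inbound.get(publicPort);
    }

    public static void main(String[] args) {
        NatSketch nat = new NatSketch("203.0.113.5");
        System.out.println(nat.translateOutgoing("192.168.0.8", 5000));
        System.out.println(nat.translateIncoming(40000));
    }
}
```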

To make things clear, iptables holds rules (organized as tables of chains) for handling any packets passing through. You can add, delete and modify these rules. Roughly, a rule is made up of two parts: 1) the criteria for matching packets; 2) what to do with the matched packets, which is called a “target”. The rough syntax of the iptables command is:

iptables [-t table] [param] [some chain in the specified table] [option]

Many built-in chains (each chain is a chain of rules) contain some pre-defined criteria to match the packets. In this post we just use the built-in chains and a pre-defined target (yes, targets can be defined separately).

Note: the terms and explanations may not be accurate, because this is just my understanding without deeper investigation.

Finally decided to use the genkitheme for this blog

Uncategorized January 9th, 2008

This blog was created to serve as a CMS (Content Management System) for the software development I’ve been involved in (I sometimes call myself a *designer* for fun; yes, a software designer), the technologies I’ve learned and am learning, the occasional resource collection, etc. In a word, it’s mainly about code and everything related to it.

And like a similar blog I wrote previously, it will be written entirely in English. This spares me from translating new and not-yet-popular technology terms, which would otherwise be mixed into text in my mother language, Chinese. Yes, it’s totally crap. To put it bluntly, I’m lazy. But don’t expect too much of my English. Maybe I call it English and you don’t. All crap!

The only other thing I want to mention is the theme I chose for this blog, named genkitheme (any idea about the name?). I first saw it on Robert Mao’s blog, and I felt comfortable with it right away. One of the things I like best is its flexible width, which leaves a wide enough area for the post content; that is sometimes very important for code snippets and pictures. I tried to find a better 3-column theme than this one, but in vain.

I’m also looking for a good Linux desktop tool for writing blog posts (I’m on Ubuntu 7.10 most of the time). Previously I used Windows Live Writer for a long time, mainly for my other blog (which is mostly about sharing my life with my family and close friends). It’s really the best one I’ve ever used (if you work under Windows you may follow Robert’s path to reach it; I reached it before he made it :). Until I find such a tool, I’ll write posts online.

And in case you are browsing this blog looking for something valuable, I’d better say something about my tech background so that you don’t waste your time here. Currently I’m not a guru in any area; I only started my career as a Java developer last June. The languages I really know (having coded in them for a relatively long time) are C/C++ (I once wrote code for some *baby* embedded systems), Delphi (not for the .NET platform) and Java/AspectJ. As for the many other commonly used languages, I won’t say anything about them precisely because they’re commonly used. I hold a master’s degree in CS and a bachelor’s degree in EE. I spent a lot of time doing research in areas like programming languages, OO, AOP and software engineering theories (yes, theories). Recently I’ve become extremely interested in web-related stuff, but you may consider me a beginner there. I’d be very pleased to work on some open source projects if given the opportunity. I’m also looking for a career in an Internet company (especially a startup). I won’t repeat my resume here. That’s enough.

Last but not least, you can reach me by kenyth at gmail dot com.

Hello world!

Uncategorized January 7th, 2008

Welcome to Kenyth.name. This is your first post. Edit or delete it, then start blogging!
